Column Store Index

Published January 2022


Introduction

SQL Server 2012 introduces a new data warehouse query acceleration feature based on a new type of index called the columnstore. This new index, combined with enhanced query optimization and execution features, improves data warehouse query performance by hundreds to thousands of times in some cases, and can routinely give a tenfold speedup for a broad range of queries fitting the scenario for which it was designed. It does all this within the familiar T-SQL query language and the programming and system management environment of SQL Server. It's thus fully compatible with all reporting solutions that run as clients of SQL Server, including SQL Server Reporting Services.

A columnstore index stores each column in a separate set of disk pages, rather than storing multiple rows per page as data traditionally has been stored. We use the term "row store" to describe either a heap or a B-tree that contains multiple rows per page. The difference between the column store and row store approaches is illustrated below.

[Figure: row store versus column store data layout]

The coluns 1/ are stored in different "roups of pa"es in the colunstore index. enefits of this are-

•

only the coluns needed to solve a query are fetched fro dis( this is often fewer than 13 of the coluns in a typical fact table4,

•

it%s easier to copress the data due to the redundancy of data within a colun, and

•

buffer hit rates are iproved because data is hi"hly copressed, and frequently accessed parts of coonly used coluns reain in eory, while infrequently used parts are pa"ed out.

The colunstore index in SQL Server eploys 5icrosoft%s patented 6ertipaq7 6ertipaq7 technolo"y, which it shares with SQL Server 'nalysis Services and 8ower8ivot. SQL Server colunstore indexes don%t have h ave to fit in ain eory, but they can effectively use as uch eory as is available on the server. server. 8ortions 8ortions of coluns are oved in and out of eory on deand. SQL Server colunstore indexes are *pure+ colun stores, not a hybrid, because they store all data for separate coluns on separate pa"es. This iproves #9: scan perforance and buffer hit rates. SQL Server is the first a;or database product to support a pure colunstore index.

Using Columnstore Indexes To iprove query perforance, all you need to do is build a colunstore index on the fact tables in a data warehouse. #f you have extreely lar"e diensions say ore than 10 illion rows4 then you ay wish to build a colunstore index on those diensions as well. 'fter that, you siply subit queries to SQL Server, and they can run uch, uch faster. <or exaple, The catalo"=sales fact table in this database database contains 1.>> billion billion rows. The followin" stateent was used to create a colunstore index that includes all the coluns of the table&?'T? :L@5AST:&? #AB?C cstore on DdboE.Dcatalo"=salesE Dcs=sold=date=s(E ,Dcs=sold=tie=s(E ,Dcs=ship=date=s(E ,Dcs=bill=custoer=s(E ,Dcs=bill=cdeo=s(E ,Dcs=bill=hdeo=s(E ,Dcs=bill=addr=s(E

,Dcs=ship=custoer=s(E ,Dcs=ship=cdeo=s(E ,Dcs=ship=hdeo=s(E ,Dcs=ship=addr=s(E ,Dcs=call=center=s(E ,Dcs=catalo"=pa"e=s(E ,Dcs=ship=ode=s(E ,Dcs=warehouse=s(E ,Dcs=ite=s(E ,Dcs=proo=s(E ,Dcs=order=nuberE ,Dcs=quantityE ,Dcs=wholesale=costE ,Dcs=list=priceE ,Dcs=sales=priceE ,Dcs=ext=discount=atE ,Dcs=ext=sales=priceE ,Dcs=ext=wholesale=costE ,Dcs=ext=list=priceE ,Dcs=ext=taxE ,Dcs=coupon=atE ,Dcs=ext=ship=costE ,Dcs=net=paidE ,Dcs=net=paid=inc=taxE ,Dcs=net=paid=inc=shipE ,Dcs=net=paid=inc=ship=taxE ,Dcs=net=profitE4

Performance Characteristics olunstore index query processin" is ost heavily optii!ed for star ;oin queries, but any types of queries can benefit. <act$to$fact <act$to$fact table ;oins and ulti$colun ;oin queries ay benefit less fro colunstore indexes, or not a tall. :LT8$style :LT8$style queries, includin" point loo(ups, and fetches of every colun of a wide row, will usually not perfor as well with a colunstore index as with a $tree index.

olunstore indexes don%t always iprove data warehouse query perforance. )hen they don%t, norally, the query optii!er will choose to use a heap or o r $tree to access the data. #f the optii!er chooses the colunstore index when in fact usin" the underlyin" heap or $tree perfors better for a query, the developer can use hints to tune the query to use the heap or $tree instead.

Loading Data

The tables with colunstore indexes can%t can% t be updated directly usin" #AS?&T, #AS?&T, @8B'T?, @8B'T ?, B?L?T?, and 5?&F? stateents, or bul( load operations. To ove data into a colunstore table you can switch in a partition, or disable the colunstore index, update the table, and rebuild the index. olunstore indexes on partitioned tables ust be partition$ali"ned. 5ost data warehouse custoers have a daily load cycle, and treat the data warehouse as read$only durin" the day, so they%ll alost certainly be able to use colunstore indexes. ' second liitation is that colunstore indexes are nonclustered indexes, so they still require the ain table, which could be either a clustered index or a heap. This ainly eans that youGll end up with two copies of the sae data. 5icrosoft has said that this liitation will "o away in a future release of SQL Server, Server, which which will have a colunstore index as the ain table. <inally, soe data types arenGt allowed. 'ccordin" to SQL Server 2012 &0 oo(s :nline :L4, the followin" data types canGt be used in a colunstore index•

binary and varbinary

•

ntext, text, and ia"e

•

varcharax4 and nvarcharax4

•

uniqueidentifier

•

rowversion and tiestap4

•

sql=variant

•

decial and nueric4 with precision "reater than 1H di"its

•

datetieoffset with scale "reater than 2

•

L& types hierarchyid and spatial types4

•

xl

You can also create a view that uses UNION ALL to combine a table with a columnstore index and an updatable table without a columnstore index into one logical table. This view can then be referenced by queries. This allows dynamic insertion of new data into a single logical fact table while still retaining much of the performance benefit of columnstore capability.

All tables that don't have columnstore indexes remain fully updateable. This allows you to, for example, create a dimension table on the fly and then use it in successive queries by joining it to the columnstore-structured fact table. This can be useful, for example, when a retail analyst wants to put, say, about 1000 products into a study group, and then run repeated queries for that study group. The IDs of these products can be placed into a study group dimension table. This table can then be joined to the columnstore-structured fact table.

Index build times for a columnstore index have been observed to be 2 to 3 times longer than the time to build a clustered B-tree index on the same data, on a pre-release build.
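The UNION ALL view and the on-the-fly study group table described above might be set up along the following lines. All object names other than `catalog_sales` and its columns are hypothetical: `catalog_sales_delta` stands for an updatable table without a columnstore index, and `study_group` for the ad hoc dimension table. The inserted values are placeholders.

```sql
-- One logical fact table: columnstore-indexed history plus an updatable delta.
-- GO is the batch separator used by SSMS and sqlcmd; CREATE VIEW must be the
-- first statement in its batch.
CREATE VIEW dbo.catalog_sales_all
AS
SELECT cs_sold_date_sk, cs_item_sk, cs_order_number, cs_quantity, cs_net_profit
FROM dbo.catalog_sales
UNION ALL
SELECT cs_sold_date_sk, cs_item_sk, cs_order_number, cs_quantity, cs_net_profit
FROM dbo.catalog_sales_delta;
GO

-- New rows go into the delta table, which has no columnstore index.
INSERT INTO dbo.catalog_sales_delta
    (cs_sold_date_sk, cs_item_sk, cs_order_number, cs_quantity, cs_net_profit)
VALUES (2451545, 1001, 1, 3, 12.50);
GO

-- Ad hoc study group dimension, created and filled on the fly.
CREATE TABLE dbo.study_group (item_sk INT NOT NULL PRIMARY KEY);
INSERT INTO dbo.study_group (item_sk) VALUES (1001), (1002), (1003);
GO

-- Repeated analysis restricted to the study group.
SELECT f.cs_item_sk,
       SUM(f.cs_quantity) AS total_quantity
FROM dbo.catalog_sales_all AS f
JOIN dbo.study_group AS g
    ON g.item_sk = f.cs_item_sk
GROUP BY f.cs_item_sk;
```

Periodically moving the delta rows into the main table during the regular load window keeps the non-columnstore portion small, which is what preserves most of the performance benefit.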

Benefits of Columnstore Indexes The priary benefit of colunstore indexes is that they can allow your users to "et uch ore business value fro their th eir data by encoura"in" the to interactively

explore it. The excellent perforance that colun stores provide a(es this possible. Iou can "et interactive response tie for queries a"ainst billions of rows on an econoical S58 server with enou"h &'5 to hold your frequently accessed data. olunstore indexes also can reduce the burden on #T and shorten ?TL tie by decreasin" reliance on pre$built suary su ary a""re"ates, whether they are indexed views, user$defined suary tables, or :L'8 cubes. Besi"nin" and aintainin" a""re"ates is often a difficult, labor$intensive tas(. ' sin"le colunstore index can replace do!ens of a""re"ates. olun stores are less le ss brittle than a""re"ates because if a query is chan"ed sli"htly, the colunstore can still support it, whereas a specific a""re"ate ay no lon"er be useful to accelerate the query. @sers who were usin" :L'8 systes only to "et fast query perforance, but who prefer to use the T$SQL lan"ua"e to write queries, ay find they can have one less ovin" part in their environent, reducin" cost and coplexity. @sers who li(e the sophisticated reportin" tools, diensional odelin" capability, forecastin" facilities, and decision$support specific query lan"ua"es that :L'8 tools offer can continue to benefit fro the. 5oreover, they ay now be able to use &:L'8 a"ainst a colunstore$indexed SQL Server data warehouse, and eet or exceed the perforance they were used to in the past with :L'8, but save tie by eliinatin" the cube buildin" process.
