Column Store Index


Introduction

SQL Server 2012 introduces a new data warehouse query acceleration feature based on a new type of index called the columnstore. This new index, combined with enhanced query optimization and execution features, improves data warehouse query performance by hundreds to thousands of times in some cases, and can routinely give a tenfold speedup for a broad range of queries fitting the scenario for which it was designed. It does all this within the familiar T-SQL query language and the programming and system management environment of SQL Server. It's thus fully compatible with all reporting solutions that run as clients of SQL Server, including SQL Server Reporting Services.

A columnstore index stores each column in a separate set of disk pages, rather than storing multiple rows per page as data traditionally has been stored. We use the term "row store" to describe either a heap or a B-tree that contains multiple rows per page. The difference between column store and row store approaches is illustrated below.

[Figure: row store and column store data layouts]

The coluns 1/ are stored in different "roups of pa"es in the colunstore index. enefits of this are-



only the coluns needed to solve a query are fetched fro dis( this is often fewer than 13 of the coluns in a typical fact table4,



it%s easier to copress the data due to the redundancy of data within a colun, and

 



buffer hit rates are iproved because data is hi"hly copressed, and frequently accessed parts of coonly used coluns reain in eory, while infrequently used parts are pa"ed out.

The colunstore index in SQL Server eploys 5icrosoft%s patented 6ertipaq7 6ertipaq7 technolo"y, which it shares with SQL Server 'nalysis Services and 8ower8ivot. SQL Server colunstore indexes don%t have h ave to fit in ain eory, but they can effectively use as uch eory as is available on the server. server. 8ortions 8ortions of coluns are oved in and out of eory on deand. SQL Server colunstore indexes are *pure+ colun stores, not a hybrid, because they store all data for separate coluns on separate pa"es. This iproves #9: scan perforance and buffer hit rates. SQL Server is the first a;or database product to support a pure colunstore index.

Using Columnstore Indexes To iprove query perforance, all you need to do is build a colunstore index on the fact tables in a data warehouse. #f you have extreely lar"e diensions say ore than 10 illion rows4 then you ay wish to build a colunstore index on those diensions as well. 'fter that, you siply subit queries to SQL Server, and they can run uch, uch faster. <or exaple, The catalo"=sales fact table in this database database contains 1.>> billion billion rows. The followin" stateent was used to create a colunstore index that includes all the coluns of the table&?'T? :[email protected]:&? #AB?C cstore on DdboE.Dcatalo"=salesE Dcs=sold=date=s(E ,Dcs=sold=tie=s(E ,Dcs=ship=date=s(E ,Dcs=bill=custoer=s(E ,Dcs=bill=cdeo=s(E ,Dcs=bill=hdeo=s(E ,Dcs=bill=addr=s(E

 

,Dcs=ship=custoer=s(E ,Dcs=ship=cdeo=s(E ,Dcs=ship=hdeo=s(E ,Dcs=ship=addr=s(E ,Dcs=call=center=s(E ,Dcs=catalo"=pa"e=s(E ,Dcs=ship=ode=s(E ,Dcs=warehouse=s(E ,Dcs=ite=s(E ,Dcs=proo=s(E ,Dcs=order=nuberE ,Dcs=quantityE ,Dcs=wholesale=costE ,Dcs=list=priceE ,Dcs=sales=priceE ,Dcs=ext=discount=atE ,Dcs=ext=sales=priceE ,Dcs=ext=wholesale=costE ,Dcs=ext=list=priceE ,Dcs=ext=taxE ,Dcs=coupon=atE ,Dcs=ext=ship=costE ,Dcs=net=paidE ,Dcs=net=paid=inc=taxE ,Dcs=net=paid=inc=shipE ,Dcs=net=paid=inc=ship=taxE ,Dcs=net=profitE4

Performance Characteristics olunstore index query processin" is ost heavily optii!ed for star ;oin queries, but any types of queries can benefit. <act$to$fact <act$to$fact table ;oins and ulti$colun  ;oin queries ay benefit less fro colunstore indexes, or not a tall. :LT8$style :LT8$style queries, includin" point loo(ups, and fetches of every colun of a wide row, will usually not perfor as well with a colunstore index as with a $tree index.

 

olunstore indexes don%t always iprove data warehouse query perforance. )hen they don%t, norally, the query optii!er will choose to use a heap or o r $tree to access the data. #f the optii!er chooses the colunstore index when in fact usin" the underlyin" heap or $tree perfors better for a query, the developer can use hints to tune the query to use the heap or $tree instead.

Loading Data

The tables with colunstore indexes can%t can% t be updated directly usin" #AS?&T, #AS?&T, @8B'T?, @8B'T ?, B?L?T?, and 5?&F? stateents, or bul( load operations. To ove data into a colunstore table you can switch in a partition, or disable the colunstore index, update the table, and rebuild the index. olunstore indexes on partitioned tables ust be partition$ali"ned. 5ost data warehouse custoers have a daily load cycle, and treat the data warehouse as read$only durin" the day, so they%ll alost certainly be able to use colunstore indexes. ' second liitation is that colunstore indexes are nonclustered indexes, so they still require the ain table, which could be either a clustered index or a heap. This ainly eans that youGll end up with two copies of the sae data. 5icrosoft has said that this liitation will "o away in a future release of SQL Server, Server, which which will have a colunstore index as the ain table. <inally, soe data types arenGt allowed. 'ccordin" to SQL Server 2012 &0 oo(s :nline :L4, the followin" data types canGt be used in a colunstore index•

• binary and varbinary

• ntext, text, and image

• varchar(max) and nvarchar(max)

• uniqueidentifier

• rowversion (and timestamp)

• sql_variant

• decimal (and numeric) with precision greater than 18 digits

• datetimeoffset with scale greater than 2

• CLR types (hierarchyid and spatial types)

• xml

You can also create a view that uses UNION ALL to combine a table with a columnstore index and an updatable table without a columnstore index into one logical table. This view can then be referenced by queries. This allows dynamic insertion of new data into a single logical fact table while still retaining much of the performance benefit of columnstore capability.

All tables that don't have columnstore indexes remain fully updateable. This allows you to, for example, create a dimension table on the fly and then use it in successive queries by joining it to the columnstore-structured fact table. This can be useful, for example, when a retail analyst wants to put, say, about 1000 products into a study group, and then run repeated queries for that study group. The IDs of these products can be placed into a study group dimension table. This table can then be joined to the columnstore-structured fact table.

Index build times for a columnstore index have been observed to be 2 to 3 times longer than the time to build a clustered B-tree index on the same data, on a pre-release build.
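Two of the loading approaches described in this section can be sketched as follows. The index name cstore and the table dbo.catalog_sales come from the earlier example; dbo.catalog_sales_delta and dbo.catalog_sales_all are hypothetical names introduced here for illustration, and the delta table is assumed to have the same column definitions as the fact table. GO is the batch separator used by client tools such as SSMS and sqlcmd.

```sql
-- Pattern 1: disable the columnstore index, modify the table, rebuild the index.
ALTER INDEX cstore ON dbo.catalog_sales DISABLE;

-- ... run INSERT / UPDATE / DELETE / MERGE or a bulk load against dbo.catalog_sales here ...

ALTER INDEX cstore ON dbo.catalog_sales REBUILD;
GO

-- Pattern 2: one logical fact table made of a read-only columnstore-indexed table
-- plus a small updatable table that has no columnstore index.
-- dbo.catalog_sales_delta and dbo.catalog_sales_all are hypothetical names.
CREATE VIEW dbo.catalog_sales_all
AS
SELECT cs_sold_date_sk, cs_item_sk, cs_quantity, cs_net_paid
FROM dbo.catalog_sales            -- columnstore index, not directly updatable
UNION ALL
SELECT cs_sold_date_sk, cs_item_sk, cs_quantity, cs_net_paid
FROM dbo.catalog_sales_delta;     -- no columnstore index, accepts new rows
GO
```

Queries then reference dbo.catalog_sales_all, while new rows are inserted into the delta table and folded into the main table during the next scheduled load.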

Benefits of Columnstore Indexes

The primary benefit of columnstore indexes is that they can allow your users to get much more business value from their data by encouraging them to interactively explore it. The excellent performance that column stores provide makes this possible. You can get interactive response time for queries against billions of rows on an economical SMP server with enough RAM to hold your frequently accessed data.

Columnstore indexes also can reduce the burden on IT and shorten ETL time by decreasing reliance on pre-built summary aggregates, whether they are indexed views, user-defined summary tables, or OLAP cubes. Designing and maintaining aggregates is often a difficult, labor-intensive task. A single columnstore index can replace dozens of aggregates. Column stores are less brittle than aggregates because if a query is changed slightly, the columnstore can still support it, whereas a specific aggregate may no longer be useful to accelerate the query.

Users who were using OLAP systems only to get fast query performance, but who prefer to use the T-SQL language to write queries, may find they can have one less moving part in their environment, reducing cost and complexity. Users who like the sophisticated reporting tools, dimensional modeling capability, forecasting facilities, and decision-support specific query languages that OLAP tools offer can continue to benefit from them. Moreover, they may now be able to use ROLAP against a columnstore-indexed SQL Server data warehouse, and meet or exceed the performance they were used to in the past with OLAP, but save time by eliminating the cube building process.
