Abstract
PISTON is a portable framework that supports the development of visualization and analysis operators using a platform-independent, data-parallel programming model. Operators such as isosurface, cut-surface, and threshold have been implemented in this framework, with exactly the same operator code achieving good parallel performance on different architectures. An important analysis operator in cosmology is the halo finder. A halo is a cluster of particles and is a common feature of interest in cosmology data. As the number of cosmological simulations has grown in recent years, so have the volume of resulting data and the number of required analysis tasks. Consequently, scalable and efficient analysis tools are needed, and we are currently implementing a halo finder operator using PISTON. Researchers have developed a wide variety of techniques to identify halos in raw particle data. The most basic algorithm is the friends-of-friends (FOF) halo finder, in which particles are clustered based on two parameters: linking length and halo size. In a FOF halo finder, particles that lie within the linking length of one another are grouped into the same halo, and the halos are then filtered by the halo size parameter. A naive implementation of a FOF halo finder compares every particle pair, requiring O(n^2) operations. Our data-parallel halo finder operator uses a balanced k-d tree to reduce the number of operations in the average case, and implements the algorithm using only data-parallel primitives in order to achieve portability and performance.
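The naive FOF baseline described in the abstract can be sketched as follows. This is a minimal sequential illustration, not the PISTON operator: the function name `naive_fof`, the union-find bookkeeping, the use of `<=` for the linking test, and the four sample points are all assumptions made for the example.

```python
import math

def naive_fof(points, linking_length, min_halo_size):
    """Group particles into halos by testing every pair (hypothetical helper)."""
    n = len(points)
    parent = list(range(n))  # union-find forest: parent[i] == i marks a root

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Every pair is compared once: n*(n-1)/2 distance tests, i.e. O(n^2).
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) <= linking_length:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[max(ri, rj)] = min(ri, rj)

    # Collect particles by root, then filter by the halo size parameter.
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return [g for g in groups.values() if len(g) >= min_halo_size]

# Hypothetical sample data: particles 0 and 2 are not direct neighbours,
# but both are linked to particle 1 ("friends of friends").
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
halos = naive_fof(points, linking_length=1.5, min_halo_size=2)
```

Here particles 0, 1 and 2 end up in one halo through chaining, and the isolated particle 3 is removed by the size filter.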
Data-Parallel Halo Finder Operator in PISTON
Wathsala Widanagamaachchi (CCS-7), University of Utah
Mentor: Christopher Sewell
Outline
● PISTON & the motivation behind it
● Data-parallel programming
● Halos & halo finder
● Naive approach & data-parallel approach
● Results
What is PISTON?
● Portable framework for the development of visualization & analysis operators
● Uses a platform-independent, data-parallel programming model
Motivation
● Lack of visualization software that takes full advantage of acceleration hardware and multi-core architectures
Data-Parallel programming & Thrust
● What is data parallelism?
– The same operation is performed by different processors on different pieces of data
● Thrust is an NVIDIA C++ template library that provides CUDA and OpenMP backends
● Most STL-like algorithms in Thrust are data-parallel
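To make the idea of data-parallel primitives concrete, here is a small sketch in plain Python. It only mimics the semantics of three primitives that Thrust provides (`thrust::transform`, `thrust::inclusive_scan`, `thrust::reduce`); it runs sequentially, and the sample values are made up for illustration.

```python
from functools import reduce
from itertools import accumulate
import operator

values = [3, 1, 4, 1, 5]  # hypothetical input

# transform: the same function is applied to every element independently,
# so each element could be handled by a different processor.
squares = list(map(lambda v: v * v, values))

# inclusive scan: running (prefix) sums; parallelizable because + is associative.
prefix = list(accumulate(values, operator.add))

# reduce: combine all elements with an associative operator.
total = reduce(operator.add, values, 0)
```

An operator written only in terms of such primitives can be retargeted by swapping the backend that implements them, which is the portability argument made above.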
Balanced k-d tree Creation
Particles: A (1,1,0), B (0,4,0), C (2,6,0), D (8,3,0), E (4,2,0), F (5,5,0), G (3,4,0)
K-d tree: node 0 = {A, B, C, D, E, F, G}
Ranks of each particle along each axis:
         A  B  C  D  E  F  G
X rank   1  0  2  6  4  5  3
Y rank   0  3  6  2  1  5  4
Z rank   0  1  3  2  4  5  6
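The per-axis ranks in the table can be reproduced with a sort followed by a scatter. The sketch below is a sequential stand-in for that pattern; the helper name `ranks_along_axis` is hypothetical. Because all Z coordinates are equal, the Z ranks depend on how ties are broken, so only the X and Y ranks are expected to match the slide.

```python
def ranks_along_axis(points, axis):
    """Rank of each particle along one axis (hypothetical helper)."""
    # Sort particle indices by coordinate. Python's sort is stable, so tied
    # particles keep their input order; the slide may break ties differently.
    order = sorted(range(len(points)), key=lambda i: points[i][axis])
    ranks = [0] * len(points)
    for rank, i in enumerate(order):  # scatter: ranks[order[rank]] = rank
        ranks[i] = rank
    return ranks

# Particles A..G from the slide, in that order.
points = [(1, 1, 0), (0, 4, 0), (2, 6, 0), (8, 3, 0), (4, 2, 0), (5, 5, 0), (3, 4, 0)]
x_rank = ranks_along_axis(points, 0)
y_rank = ranks_along_axis(points, 1)
```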
Balanced k-d tree Creation
K-d tree: node 0 = {A, B, C, D, E, F, G}
Segment in X axis; split value at node 0: 2.5 in X axis
         A  B  C  D  E  F  G
X rank   1  0  2  6  4  5  3
Y rank   0  3  6  2  1  5  4
Z rank   0  1  3  2  4  5  6
Balanced k-d tree Creation
K-d tree: node 0 (split value 2.5 in X axis); node 1 = {A, B, C}; node 2 = {D, E, F, G}
Segment in X axis
         node 1   | node 2
         A  B  C  | D  E  F  G
X rank   1  0  2  | 6  4  5  3
Y rank   0  3  6  | 2  1  5  4
Z rank   0  1  3  | 2  4  5  6
Balanced k-d tree Creation
K-d tree: node 0 (split value 2.5 in X axis); node 1 = {A, B, C}; node 2 = {D, E, F, G}
Segment in X axis; ranks recomputed within each segment
         node 1   | node 2
         A  B  C  | D  E  F  G
X rank   1  0  2  | 3  1  2  0
Y rank   0  1  2  | 1  0  3  2
Z rank   0  1  2  | 0  1  2  3
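Recomputing the ranks inside each segment, as in the table for nodes 1 and 2, can be sketched by sorting on a (segment, coordinate) key and subtracting each segment's start position. This is a sequential illustration under assumptions: the helper name `segmented_ranks` and the `segment_ids` encoding are hypothetical, and a data-parallel version would more likely use a keyed sort plus a segmented scan.

```python
def segmented_ranks(points, segment_ids, axis):
    """Rank of each particle along one axis, restarted per segment (hypothetical)."""
    n = len(points)
    # Sort by (segment, coordinate) so each segment is contiguous and ordered;
    # ties keep input order because the sort is stable.
    order = sorted(range(n), key=lambda i: (segment_ids[i], points[i][axis]))
    ranks = [0] * n
    seg_start = 0
    for pos, i in enumerate(order):
        if pos > 0 and segment_ids[i] != segment_ids[order[pos - 1]]:
            seg_start = pos  # a new segment begins here
        ranks[i] = pos - seg_start
    return ranks

# Particles A..G; A, B, C belong to node 1 and D, E, F, G to node 2.
points = [(1, 1, 0), (0, 4, 0), (2, 6, 0), (8, 3, 0), (4, 2, 0), (5, 5, 0), (3, 4, 0)]
segment_ids = [1, 1, 1, 2, 2, 2, 2]
x_rank = segmented_ranks(points, segment_ids, 0)
y_rank = segmented_ranks(points, segment_ids, 1)
z_rank = segmented_ranks(points, segment_ids, 2)
```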
Balanced k-d tree Creation
K-d tree: node 0; node 1 = {A, B, C}; node 2 = {D, E, F, G}
Segment in Y axis; split value at node 1: 2.5 in Y axis; split value at node 2: 3.5 in Y axis
         node 1   | node 2
         A  B  C  | D  E  F  G
X rank   1  0  2  | 3  1  2  0
Y rank   0  1  2  | 1  0  3  2
Z rank   0  1  2  | 0  1  2  3
Balanced k-d tree Creation
K-d tree: node 0; nodes 1, 2; node 3 = {A}; node 4 = {B, C}; node 5 = {D, E}; node 6 = {F, G}
Segment in Y axis; split value at node 1: 2.5 in Y axis; split value at node 2: 3.5 in Y axis
         node 3 | node 4 | node 5 | node 6
         A      | B  C   | D  E   | F  G
X rank   1      | 0  2   | 3  1   | 2  0
Y rank   0      | 1  2   | 1  0   | 3  2
Z rank   0      | 1  2   | 0  1   | 2  3
Balanced k-d tree Creation
K-d tree: node 0; nodes 1, 2; node 3 = {A}; node 4 = {B, C}; node 5 = {D, E}; node 6 = {F, G}
Segment in Y axis; ranks recomputed within each segment
         node 3 | node 4 | node 5 | node 6
         A      | B  C   | D  E   | F  G
X rank   0      | 0  1   | 1  0   | 1  0
Y rank   0      | 0  1   | 1  0   | 1  0
Z rank   0      | 0  1   | 0  1   | 0  1
Balanced k-d tree Creation
K-d tree: node 0; nodes 1, 2; nodes 3 (A), 4, 5, 6; leaves 7 (B), 8 (C), 9 (D), 10 (E), 11 (F), 12 (G)
Every particle is now alone in its segment, so all X, Y and Z ranks are 0.
At each k-d tree node, store the parent, child details, segment details & split value.
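The construction walked through above can be sketched as a level-by-level build. The rules used here are assumptions chosen because they reproduce the values on the slides: the lower floor(n/2) particles go to the left child, the split value is the midpoint between the two particles adjacent to the cut, the axis cycles X, Y, Z with depth, and ties are broken by particle index. The dictionary layout and the name `build_kd_tree` are hypothetical.

```python
from collections import deque

def make_node(parent, segment):
    # Each node records its parent, children, segment and split (as on the slide).
    return {"parent": parent, "children": None, "segment": segment,
            "axis": None, "split": None}

def build_kd_tree(points):
    """Breadth-first balanced k-d tree build (illustrative sketch)."""
    nodes = [make_node(None, list(range(len(points))))]
    queue = deque([(0, 0)])  # (node id, depth)
    while queue:
        node_id, depth = queue.popleft()
        node = nodes[node_id]
        if len(node["segment"]) < 2:
            continue  # a single particle is a leaf
        axis = depth % 3  # assumed axis order: X, Y, Z, X, ...
        order = sorted(node["segment"], key=lambda i: (points[i][axis], i))
        half = len(order) // 2
        left, right = order[:half], order[half:]
        node["axis"] = axis
        # Assumed split rule: midpoint between the neighbours of the cut.
        node["split"] = 0.5 * (points[left[-1]][axis] + points[right[0]][axis])
        node["children"] = (len(nodes), len(nodes) + 1)
        for child_segment in (left, right):
            nodes.append(make_node(node_id, child_segment))
            queue.append((len(nodes) - 1, depth + 1))
    return nodes

# Particles A..G from the slides (indices 0..6).
points = [(1, 1, 0), (0, 4, 0), (2, 6, 0), (8, 3, 0), (4, 2, 0), (5, 5, 0), (3, 4, 0)]
tree = build_kd_tree(points)
```

With these rules the sketch yields 13 nodes, split values 2.5 (X) at node 0, 2.5 (Y) at node 1 and 3.5 (Y) at node 2, and leaves 7 to 12 holding B to G.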
Finding Halos
● Bottom-up approach
● At each level, consider all nodes in the level
Particles: A (1,1,0), B (0,4,0), C (2,6,0), D (8,3,0), E (4,2,0), F (5,5,0), G (3,4,0)
K-d tree: node 0; nodes 1, 2; nodes 3 (A), 4, 5, 6; leaves 7 (B), 8 (C), 9 (D), 10 (E), 11 (F), 12 (G)
Finding Halos
● Bottom-up approach
● At each level, consider all nodes in the level
● Look at the split value & segment particles
K-d tree: node 0; nodes 1, 2; nodes 3 (A), 4, 5, 6; leaves 7 (B), 8 (C), 9 (D), 10 (E), 11 (F), 12 (G)
Split value at node 0 is 2.5
Finding Halos
● Bottom-up approach
● At each level, consider all nodes in the level
● Look at the split value & segment particles
● Determine the particles within the linking length in the split axis
K-d tree: node 0; nodes 1, 2; nodes 3 (A), 4, 5, 6; leaves 7 (B), 8 (C), 9 (D), 10 (E), 11 (F), 12 (G)
Split value at node 0 is 2.5; linking length 2
Finding Halos
● Bottom-up approach
● At each level, consider all nodes in the level
● Look at the split value & segment particles
● Determine the particles within the linking length in the split axis
● Do m*n comparisons between the selected particles on the two sides of the split & determine halos
● Filter halos
K-d tree: node 0; nodes 1, 2; nodes 3 (A), 4, 5, 6; leaves 7 (B), 8 (C), 9 (D), 10 (E), 11 (F), 12 (G)
Split value at node 0 is 2.5; linking length 2
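The steps above can be sketched as follows. This is a sequential illustration under assumptions, not the PISTON operator: recursion that handles children before their parent stands in for the level-by-level bottom-up pass, halos are tracked with a union-find array, the linking test uses `<=`, and the split rule is the median rule assumed earlier. All function names are hypothetical.

```python
import math

def find(parent, i):
    while parent[i] != i:
        parent[i] = parent[parent[i]]  # path halving
        i = parent[i]
    return i

def merge_across_split(points, left, right, axis, split, link, parent):
    # Only particles within one linking length of the split plane can be
    # linked to a particle on the other side.
    near_left = [i for i in left if points[i][axis] >= split - link]
    near_right = [j for j in right if points[j][axis] <= split + link]
    for i in near_left:  # m * n comparisons
        for j in near_right:
            if math.dist(points[i], points[j]) <= link:
                ri, rj = find(parent, i), find(parent, j)
                if ri != rj:
                    parent[max(ri, rj)] = min(ri, rj)
    return len(near_left) * len(near_right)

def link_segment(points, segment, link, parent, depth=0):
    if len(segment) < 2:
        return
    axis = depth % 3
    order = sorted(segment, key=lambda i: (points[i][axis], i))
    half = len(order) // 2
    left, right = order[:half], order[half:]
    split = 0.5 * (points[left[-1]][axis] + points[right[0]][axis])
    link_segment(points, left, link, parent, depth + 1)   # children first,
    link_segment(points, right, link, parent, depth + 1)  # then this node
    merge_across_split(points, left, right, axis, split, link, parent)

def find_halos(points, link, min_halo_size):
    parent = list(range(len(points)))
    link_segment(points, list(range(len(points))), link, parent)
    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(parent, i), []).append(i)
    return [g for g in groups.values() if len(g) >= min_halo_size]  # filter halos

# Particles A..G from the slides (indices 0..6).
points = [(1, 1, 0), (0, 4, 0), (2, 6, 0), (8, 3, 0), (4, 2, 0), (5, 5, 0), (3, 4, 0)]

# Merge at node 0 only (split 2.5 in X, linking length 2): counts the m * n tests.
root_tests = merge_across_split(points, [0, 1, 2], [3, 4, 5, 6], 0, 2.5, 2.0,
                                list(range(len(points))))
```

At node 0 only A and C on the left and E and G on the right lie within the linking length of the split, so 2*2 comparisons replace the 3*4 needed without the filter. For this tiny example no two particles are within distance 2 of each other; with a hypothetical linking length of 2.5, particles C, E, F and G form one halo.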
Optimization: Use of Bounding Boxes
● Each node has a bounding box (BB) calculated from its segment particles
● Use the BB to reduce the number of comparisons
[Figure: k-d tree]
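One way bounding boxes could reduce the comparisons is sketched below. It is an assumption about how the BB might be used, not the method from the slides: if the minimum distance between the axis-aligned boxes of two segments exceeds the linking length, no pair across them can be linked and the m*n comparisons can be skipped. The function names are hypothetical.

```python
import math

def bounding_box(points, segment):
    """Axis-aligned bounding box (lower corner, upper corner) of a segment."""
    lo = tuple(min(points[i][a] for i in segment) for a in range(3))
    hi = tuple(max(points[i][a] for i in segment) for a in range(3))
    return lo, hi

def box_distance(box_a, box_b):
    """Minimum Euclidean distance between two axis-aligned boxes."""
    (lo_a, hi_a), (lo_b, hi_b) = box_a, box_b
    # Per-axis gap between the boxes; 0 when they overlap on that axis.
    gaps = [max(lo_a[a] - hi_b[a], lo_b[a] - hi_a[a], 0) for a in range(3)]
    return math.sqrt(sum(g * g for g in gaps))

def may_contain_links(points, seg_a, seg_b, link):
    # If the boxes are farther apart than the linking length, skip the pair tests.
    return box_distance(bounding_box(points, seg_a), bounding_box(points, seg_b)) <= link

# Particles A..G from the slides (indices 0..6).
points = [(1, 1, 0), (0, 4, 0), (2, 6, 0), (8, 3, 0), (4, 2, 0), (5, 5, 0), (3, 4, 0)]
box_left = bounding_box(points, [0, 1, 2])      # node 1 = {A, B, C}
box_right = bounding_box(points, [3, 4, 5, 6])  # node 2 = {D, E, F, G}
```

For example, the boxes of nodes 1 and 2 are 1 apart, so their particles still have to be compared at linking length 2, while the boxes of {B, C} and {D, E} are more than 2 apart and could be skipped outright.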