Kronecker mask #1
…10.3.1_branch GraphBLAS 10.3.1
Memory consumption measurements of the masked Kronecker implementation

Measurements were taken with heaptrack and can be reproduced with https://github.com/ilyamaltsev05/experiment
Three kinds of input are considered:

1) A computation equivalent to the mxm in PageRank_demo, with a mask from the SuiteSparse Matrix Collection.
Peak memory consumption
Execution time, over 10 runs

2) A computation equivalent to the mxm in PageRank_demo. The existing implementation, which applies the mask afterwards, is compared with the new one on the Kronecker product of an m-by-1 matrix and a 1-by-n matrix of ones. The m-by-1 matrix and the mask are built with LAGraph_Random_Matrix. Below, the m-by-1 matrix is called vec and the mask is called mask.
Peak memory consumption for some inputs
Execution time, over 100 runs

3) A computation on some matrices from the SuiteSparse Matrix Collection.
Peak memory consumption of the inputs
Execution time, over 5 runs
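To make case 2 concrete, here is a minimal sketch in C of why applying the mask during the product can lower peak memory. It only counts entries; it is not the PR's code. It assumes that the mask-afterwards path materializes the full product first, that the mask is structural and non-complemented, and that it is stored in CSR with a row-pointer array `Mp`. The function names and the `vec_present` array are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

// Entries in the unmasked product kron (vec, ones (1,n)):
// every present vec(i) fills an entire row of n entries.
static int64_t kron_nvals_unmasked (const int8_t *vec_present, int64_t m, int64_t n)
{
    int64_t nvals = 0 ;
    for (int64_t i = 0 ; i < m ; i++)
    {
        if (vec_present [i]) nvals += n ;
    }
    return (nvals) ;
}

// Entries that remain when a structural, non-complemented mask is applied
// inside the product: only mask entries in rows where vec(i) is present.
// Mp is the (hypothetical) CSR row-pointer array of the mask, of size m+1.
static int64_t kron_nvals_masked (const int8_t *vec_present, int64_t m, const int64_t *Mp)
{
    int64_t nvals = 0 ;
    for (int64_t i = 0 ; i < m ; i++)
    {
        if (vec_present [i]) nvals += Mp [i+1] - Mp [i] ;
    }
    return (nvals) ;
}
```

The masked count never exceeds the number of entries in the mask, while the unmasked count grows with n regardless of how sparse the mask is.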
feat: moved implementation to template
bm = 4 ;
bn = 2 ;

Ax = sparse (100 * sprandn (am,an, 0.5)) ;
A.matrix = sprand (5, 10, 0.4) ;
B.matrix = ones (3, 2) ;
B.iso = true ;
M.matrix = sprandn (15, 20,0.2) ~= 0 ;

clear C
C.matrix = Cx ;
C.matrix = sparse (cm,cn) ;
Changed back to Cx since its previous content will be replaced anyway due to default output descriptor field.
GB_RETURN_IF_QUICK_MASK (C, C_replace, M, Mask_comp, Mask_struct) ;

// check if it's possible to apply mask immediately in kron
// TODO: MT should have its own 32/64 bitness controls
Done: C now has its own controls (still based on the number of values in the mask), even if the mask itself uses 64-bit integers. However, determining the bitness from the number of values in C would need another pass over the mask beforehand, since C can be sparser than the mask. I don't know how often this matters, or whether the bitness of C's arrays would differ in many cases. Should I add that pass and determine the bitness from the number of values in C?
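The trade-off in this question can be illustrated with a small sketch in C. The rule below (32-bit offsets suffice while the entry count fits in `uint32_t`) is an assumption made only for illustration; it is not GraphBLAS's actual policy, and `needs_64bit_offsets` is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

// Assumed, simplified rule (not the library's real policy): offsets into the
// index/value arrays need 64 bits only when the entry count exceeds UINT32_MAX.
static bool needs_64bit_offsets (uint64_t nvals)
{
    return (nvals > UINT32_MAX) ;
}
```

Under this assumed rule, sizing by the number of mask values (an upper bound) and sizing by the exact number of C values disagree only when the mask count exceeds the 32-bit limit while the C count does not. The extra pass over the mask pays off only in that case.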
Implements non-complemented mask support in the Kronecker product. The implementation is moved into GB_kron, since transposed A and B input matrices are not needed and bitmap A and B are fine.
MT is built as a sparse matrix in the same CSR/CSC format as the mask matrix M. A first pass over the mask counts the number of values in each vector of MT; the pointer array MTp is then passed to GB_cumsum to turn the counts into a prefix sum. A second pass over the mask fills the index array MTi and the values in MTx. MT is then transposed and converted to hypersparse if necessary, and passed to GB_accum_mask to transplant its result into C.
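The two-pass construction described above can be sketched in plain C as follows. This is a simplified stand-in, not the PR's code: `bitmap_t`, `csr_t`, `kron_entry` and `masked_kron` are hypothetical names, A and B are dense row-major arrays with presence flags (a bitmap-like layout), the mask is structural CSR, a plain loop takes the place of GB_cumsum, and the later transpose, hypersparse conversion and GB_accum_mask step are omitted.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

// Hypothetical bitmap-like matrix: row-major values with presence flags.
typedef struct { int64_t nrows, ncols ; const int8_t *b ; const double *x ; } bitmap_t ;

// Hypothetical CSR result: p has nrows+1 row pointers, j holds column indices.
typedef struct { int64_t nrows, ncols ; int64_t *p, *j ; double *x ; } csr_t ;

// Returns 1 and sets *val if kron(A,B)(i,j) = A(i/bm, j/bn) * B(i%bm, j%bn) exists.
static int kron_entry (const bitmap_t *A, const bitmap_t *B,
    int64_t i, int64_t j, double *val)
{
    int64_t pa = (i / B->nrows) * A->ncols + (j / B->ncols) ;
    int64_t pb = (i % B->nrows) * B->ncols + (j % B->ncols) ;
    if (!A->b [pa] || !B->b [pb]) return (0) ;
    *val = A->x [pa] * B->x [pb] ;
    return (1) ;
}

// Builds T = kron(A,B) restricted to the pattern of the CSR mask (Mp, Mj).
// Returns 0 on success, -1 if out of memory.
static int masked_kron (csr_t *T, const bitmap_t *A, const bitmap_t *B,
    const int64_t *Mp, const int64_t *Mj)
{
    int64_t nrows = A->nrows * B->nrows ;
    double v ;
    T->nrows = nrows ;
    T->ncols = A->ncols * B->ncols ;
    T->p = calloc ((size_t) nrows + 1, sizeof (int64_t)) ;
    if (T->p == NULL) return (-1) ;
    // pass 1: count the entries each row of T will hold
    for (int64_t i = 0 ; i < nrows ; i++)
        for (int64_t p = Mp [i] ; p < Mp [i+1] ; p++)
            if (kron_entry (A, B, i, Mj [p], &v)) T->p [i+1]++ ;
    // prefix sum turns the counts into row pointers
    for (int64_t i = 0 ; i < nrows ; i++) T->p [i+1] += T->p [i] ;
    int64_t nnz = T->p [nrows] ;
    T->j = malloc ((size_t) (nnz > 0 ? nnz : 1) * sizeof (int64_t)) ;
    T->x = malloc ((size_t) (nnz > 0 ? nnz : 1) * sizeof (double)) ;
    if (T->j == NULL || T->x == NULL)
    {
        free (T->p) ; free (T->j) ; free (T->x) ;
        return (-1) ;
    }
    // pass 2: fill the indices and values at the computed positions
    for (int64_t i = 0 ; i < nrows ; i++)
    {
        int64_t q = T->p [i] ;
        for (int64_t p = Mp [i] ; p < Mp [i+1] ; p++)
            if (kron_entry (A, B, i, Mj [p], &v)) { T->j [q] = Mj [p] ; T->x [q++] = v ; }
    }
    return (0) ;
}
```

Both passes visit only the mask's entries, so the work and the output size are bounded by the number of values in the mask rather than by the size of the full product.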
Two tests (test226 and test227) are modified to check the implementation, and both pass (with malloc debugging turned off, as in testall).
UPD: removed the mask from the final GB_accum_mask call.