This repository was archived by the owner on Jun 3, 2025. It is now read-only.

Commit edf177e

Authored by Satrat, dbogunowicz, rahul-tuli, dsikka, and bfineran
[Cherry Picks] Analyze Bug Fixes (Updated) (#465)

* `RegistryMixin` improved alias management: add docstrings, simplify, harden, format registry lookup strings to lowercase, standardize aliases (#404)
* Move evaluator registry (#411)
* More control over external data size (#412)
* When splitting external data, avoid renaming `model.data` to `model.data.1` if only one external data file eventually gets saved (#414)
* [model.download] Fix function returning nothing (#420)
* [BugFix] Path not expanded (#418)
* [Fix] Allow for processing Path in the sparsezoo analysis (#417)
* Raise TypeError instead of ValueError (#426)
* Fix misleading docstring and add a test (#416)
* Add support for `benchmark.yaml` (#415): recent zoo models use `benchmark.yaml` instead of `benchmarks.yaml`, so add this additional pathway to the bulk model download; also update the files filter and fix tests
* [BugFix] Add analyze to init; move onnxruntime to deps (#421)
* Print model analysis at the end of the CLI run (#423)
* Omit scalar weights (#424)
* Update analyze help message for correctness (#432)
* Initial commit (#430)
* [sparsezoo.analyze] Fix pathway such that it works for larger models (#437)
* Delete hehe.py (#439)
* Download deployment dir for LLMs; use path instead of download (#435)
* Only set save_as_external_data to true if the model originally had external data (#442)
* Add channel-wise quantization support (#441)
* Chunk download (#429): break each download into chunks, fetch chunks as thread-based jobs, then combine and delete the chunks; handle files smaller than the chunk size
* Fix type hints (#445)
* Fix bug if the value is a dict (#447)
* [deepsparse.analyze] Fix v1 functionality to work with LLMs (#451): apply the equivalent analyze_v2 changes so the inference session works for LLMs, and downgrade warnings to debug printouts
* Overwrite file (#450)
* Add a `numpy_array_representer` to yaml at runtime, to avoid serialization issues (#454)
* Avoid division by zero and log of zero (#457)
* Op analysis total counts had double sparse counts (#461)
* Rename legacy analyze to analyze_v1 (#459)
* Fix quant % calculation (#462)
* Include sparsity in size calculation (#463)
* Revert "Merge branch 'main' into analyze_cherry_picks" (reverts commit 509fa1a, reversing changes made to 08f94c4)

Co-authored-by: dbogunowicz <97082108+dbogunowicz@users.noreply.github.com>
Co-authored-by: dbogunowicz <damian@neuralmagic.com>
Co-authored-by: Rahul Tuli <rahul@neuralmagic.com>
Co-authored-by: Dipika Sikka <dipikasikka1@gmail.com>
Co-authored-by: Benjamin Fineran <bfineran@users.noreply.github.com>
Co-authored-by: George <george@neuralmagic.com>
Co-authored-by: 21 <a21@21s-MacBook-Pro.local>
1 parent 44b7972 commit edf177e

File tree

6 files changed: +58 −39 lines changed

src/sparsezoo/analyze_v2/memory_access_analysis.py

Lines changed: 8 additions & 5 deletions
```diff
@@ -73,7 +73,7 @@ def get_quantization(self) -> List["QuantizationAnalysisSchema"]:
         :returns: List of quantization analysis pydantic models for each grouping
             if the node has weights
         """
-        data = get_memeory_access_bits(self.model_graph, self.node, self.node_shape)
+        data = get_memory_access_bits(self.model_graph, self.node, self.node_shape)
         if data is not None:
             quantization_analysis_model = []
             for grouping, counts_dict in data.items():
@@ -152,7 +152,7 @@ def get_memory_access_counts(
     }


-def get_memeory_access_bits(
+def get_memory_access_bits(
     model_graph: ONNXGraph,
     node: NodeProto,
     node_shape: Dict,
@@ -164,12 +164,15 @@ def get_memeory_access_bits(
     )
     node_weight = get_node_weight(model_graph, node)
     precision = get_numpy_quantization_level(node_weight)
-    bits = memory_access_counts["single"]["counts"] * precision
-    bits_quant = bits * is_quantized_layer(model_graph, node)
+    counts = memory_access_counts["single"]["counts"]
+    bits = counts * precision
+    is_quantized = is_quantized_layer(model_graph, node)

     return {
         "tensor": {
             "bits": bits,
-            "bits_quant": bits_quant,
+            "bits_quant": bits * is_quantized,
+            "counts": counts,
+            "counts_quant": counts * is_quantized,
         }
     }
```
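The corrected return shape can be sketched in isolation. The helper below is hypothetical (the real function derives `counts`, `precision`, and the quantized flag from the ONNX graph); it only illustrates that raw element counts are now reported alongside bit totals:

```python
def memory_access_bits_summary(counts: int, precision: int, is_quantized: bool) -> dict:
    """Sketch of the fixed get_memory_access_bits return value: counts and
    counts_quant travel with bits/bits_quant so that downstream percentage
    math can work over element counts rather than bit totals."""
    bits = counts * precision
    return {
        "tensor": {
            "bits": bits,
            "bits_quant": bits * is_quantized,
            "counts": counts,
            "counts_quant": counts * is_quantized,
        }
    }

# e.g. 1000 memory accesses at int8 precision on a quantized layer
summary = memory_access_bits_summary(1000, 8, True)
```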

src/sparsezoo/analyze_v2/model_analysis.py

Lines changed: 7 additions & 7 deletions
```diff
@@ -78,10 +78,10 @@ def calculate_sparsity_percentage(self, category: Dict):
         counts = category["counts"]
         return (counts_sparse / counts) * 100 if counts != 0 else 0

-    def calculate_quantized_percentage(self, tensor: Dict):
-        bits_quant = tensor["bits_quant"]
-        bits = tensor["bits"]
-        return (bits_quant / bits) * 100 if bits != 0 else 0
+    def calculate_quantized_percentage(self, tensor: Dict, counts_prefix: str):
+        counts_quant = tensor[f"{counts_prefix}_quant"]
+        counts = tensor[counts_prefix]
+        return (counts_quant / counts) * 100 if counts != 0 else 0

     def __repr__(self):
         data = self.to_dict()
@@ -93,7 +93,7 @@ def __repr__(self):
         )
         param_size = summaries["params"]["quantization"]["tensor"]["bits"]
         param_quantized = self.calculate_quantized_percentage(
-            summaries["params"]["quantization"]["tensor"]
+            summaries["params"]["quantization"]["tensor"], "counts"
         )

         ops_total = summaries["ops"]["sparsity"]["single"]["counts"]
@@ -102,7 +102,7 @@ def __repr__(self):
         )
         ops_size = summaries["ops"]["quantization"]["tensor"]["bits"]
         ops_quantized = self.calculate_quantized_percentage(
-            summaries["ops"]["quantization"]["tensor"]
+            summaries["ops"]["quantization"]["tensor"], "counts"
         )

         mem_access_total = summaries["mem_access"]["sparsity"]["single"]["counts"]
@@ -111,7 +111,7 @@ def __repr__(self):
         )
         mem_access_size = summaries["mem_access"]["quantization"]["tensor"]["bits"]
         mem_access_quantized = self.calculate_quantized_percentage(
-            summaries["mem_access"]["quantization"]["tensor"]
+            summaries["mem_access"]["quantization"]["tensor"], "counts"
         )

         return (
```
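The quant % fix (#462) switches the ratio from bit totals to element counts. A standalone sketch of the new method body, with an illustrative tensor dict (the sample numbers are made up: 750 of 1000 elements quantized to int8, the rest fp32):

```python
def calculate_quantized_percentage(tensor: dict, counts_prefix: str) -> float:
    """Sketch of the fixed method: the percentage is taken over element
    counts (counts_prefix="counts"), so mixed-precision layers no longer
    skew the ratio the way a bits-based quotient did."""
    counts_quant = tensor[f"{counts_prefix}_quant"]
    counts = tensor[counts_prefix]
    return (counts_quant / counts) * 100 if counts != 0 else 0

# 750/1000 quantized elements -> 75%; the old bits-based ratio on the same
# layer would be 6000/14000 (750*8 quantized bits over 250*32 + 750*8 total)
tensor = {"counts": 1000, "counts_quant": 750, "bits": 14000, "bits_quant": 6000}
```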

src/sparsezoo/analyze_v2/operation_analysis.py

Lines changed: 14 additions & 13 deletions
```diff
@@ -166,22 +166,23 @@ def get_operation_bits(
         precision = get_numpy_quantization_level(node_weight)
         is_quantized_op = "32" not in str(precision)

-        bits = (
-            ops["single"]["counts"] + ops["single"]["counts_sparse"]
-        ) * precision
-
-        bits_block4 = (
-            ops["block4"]["counts"] + ops["block4"]["counts_sparse"]
-        ) * precision
-
-        bits_quant = is_quantized_op * bits
+        single_counts = ops["single"]["counts"]
+        single_counts_sparse = ops["single"]["counts_sparse"]
+        single_bits = (single_counts - single_counts_sparse) * precision
+        block4_counts = ops["block4"]["counts"]
+        block4_counts_sparse = ops["block4"]["counts_sparse"]
+        block4_bits = (block4_counts - block4_counts_sparse) * precision
         return {
             "tensor": {
-                "bits": bits,
-                "bits_quant": bits_quant,
+                "counts": single_counts,
+                "counts_quant": is_quantized_op * single_counts,
+                "bits": single_bits,
+                "bits_quant": is_quantized_op * single_bits,
             },
             "block4": {
-                "bits": bits_block4,
-                "bits_quant": bits_quant,
+                "counts": block4_counts,
+                "counts_quant": is_quantized_op * block4_counts,
+                "bits": block4_bits,
+                "bits_quant": is_quantized_op * block4_bits,
            },
        }
```
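The core of this hunk is the double-counting fix (#461): sparse op counts were previously *added* to the dense counts before multiplying by precision, and are now *subtracted*, which also folds sparsity into the size estimate (#463). A minimal sketch of just that arithmetic, with made-up numbers:

```python
def operation_bits(counts: int, counts_sparse: int, precision: int) -> int:
    """Sketch of the fix: pruned (sparse) ops contribute no stored bits,
    so the size is (counts - counts_sparse) * precision. The old code
    computed (counts + counts_sparse) * precision, double-counting the
    sparse portion."""
    return (counts - counts_sparse) * precision

# 1000 ops with 400 pruned away at fp32: 600 dense ops contribute bits,
# whereas the buggy formula would have charged for 1400
dense_bits = operation_bits(1000, 400, 32)
```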

src/sparsezoo/analyze_v2/parameter_analysis.py

Lines changed: 10 additions & 7 deletions
```diff
@@ -29,7 +29,7 @@
     get_node_num_four_block_zeros_and_size,
     get_node_param_counts,
     get_node_weight,
-    get_node_weight_bits,
+    get_node_weight_precision,
     get_numpy_distribution_statistics,
     get_numpy_entropy,
     get_numpy_modes,
@@ -153,14 +153,17 @@ def get_parameter_bits(
     If the layer is quantized, assume all its elements in the ndarray
     are quantized
     """
-    node_weight = get_node_weight(model_graph, node)
-    if node_weight is not None and node_weight.size > 0:
-        bits = get_node_weight_bits(model_graph, node)
-
+    num_weights, num_bias, num_sparse_weights = get_node_param_counts(node, model_graph)
+    if num_weights > 0:
+        precision = get_node_weight_precision(model_graph, node)
+        is_quantized = is_quantized_layer(model_graph, node)
+        num_non_sparse_weights = num_weights - num_sparse_weights + num_bias
         return {
             "tensor": {
-                "bits": bits,
-                "bits_quant": bits * is_quantized_layer(model_graph, node),
+                "counts": num_weights,
+                "counts_quant": num_weights * is_quantized,
+                "bits": num_non_sparse_weights * precision,
+                "bits_quant": num_non_sparse_weights * precision * is_quantized,
            },
        }
```
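The parameter-size change mirrors the op-analysis one: sparse weights are excluded from the stored-bits estimate (they compress away), while `counts` still reports the full weight total. A standalone sketch of the new body, with the graph-derived inputs passed in directly (a hypothetical helper, not the real signature):

```python
def parameter_bits(num_weights: int, num_bias: int, num_sparse: int,
                   precision: int, is_quantized: bool) -> dict:
    """Sketch of the fixed get_parameter_bits: size is computed over the
    non-sparse weights plus bias, at the layer's per-element precision;
    counts report all weights regardless of sparsity."""
    num_non_sparse = num_weights - num_sparse + num_bias
    return {
        "tensor": {
            "counts": num_weights,
            "counts_quant": num_weights * is_quantized,
            "bits": num_non_sparse * precision,
            "bits_quant": num_non_sparse * precision * is_quantized,
        }
    }

# 1000 int8 weights, 10 bias values, 400 of the weights pruned:
# 610 stored elements at 8 bits each
result = parameter_bits(1000, 10, 400, 8, True)
```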

src/sparsezoo/analyze_v2/schemas/quantization_analysis.py

Lines changed: 15 additions & 3 deletions
```diff
@@ -20,6 +20,14 @@


 class QuantizationSummaryAnalysisSchema(BaseModel):
+    counts: float = Field(..., description="Total number of weights")
+    counts_quant: int = Field(
+        ...,
+        description=(
+            "Total number of quantized weights."
+            "Here we assume if the layer is quantized, the entire array is quantized"
+        ),
+    )
     bits: float = Field(..., description="Total bits required to store the weights")
     bits_quant: int = Field(
         ...,
@@ -39,9 +47,9 @@ def validate_types(cls, value):
     @validator("percent", pre=True, always=True)
     def calculate_percent_if_none(cls, value, values):
         if value is None:
-            bits = values.get("bits", 0)
-            bits_quant = values.get("bits_quant", 0)
-            return bits_quant / bits if bits > 0 else 0.0
+            counts = values.get("counts", 0)
+            counts_quant = values.get("counts_quant", 0)
+            return counts_quant / counts if counts > 0 else 0.0
         return value

     def __add__(self, model: BaseModel):
@@ -51,7 +59,9 @@ def __add__(self, model: BaseModel):

         if validator_model is not None:
             return validator_model(
+                counts=self.counts + model.counts,
                 bits=self.bits + model.bits,
+                counts_quant=self.counts_quant + model.counts_quant,
                 bits_quant=self.bits_quant + model.bits_quant,
             )

@@ -67,6 +77,8 @@ def __add__(self, model: BaseModel):
         if validator_model is not None and self.grouping == model.grouping:
             return validator_model(
                 grouping=self.grouping,
+                counts=self.counts + model.counts,
                 bits=self.bits + model.bits,
+                counts_quant=self.counts_quant + model.counts_quant,
                 bits_quant=self.bits_quant + model.bits_quant,
             )
```
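The schema changes add `counts`/`counts_quant` fields, derive the `percent` fallback from them, and aggregate them in `__add__`. A plain-dataclass sketch of that behavior (the real schema is a pydantic `BaseModel` with a `@validator`; this strips the validation machinery to show just the new arithmetic):

```python
from dataclasses import dataclass


@dataclass
class QuantizationSummary:
    """Dataclass sketch of QuantizationSummaryAnalysisSchema: counts fields
    sit alongside bits fields, percent falls back to a counts-based ratio,
    and addition aggregates both pairs."""
    counts: float
    counts_quant: int
    bits: float
    bits_quant: int

    @property
    def percent(self) -> float:
        # Mirrors calculate_percent_if_none: counts-based, zero-safe
        return self.counts_quant / self.counts if self.counts > 0 else 0.0

    def __add__(self, other: "QuantizationSummary") -> "QuantizationSummary":
        return QuantizationSummary(
            counts=self.counts + other.counts,
            counts_quant=self.counts_quant + other.counts_quant,
            bits=self.bits + other.bits,
            bits_quant=self.bits_quant + other.bits_quant,
        )


# two layers: fp32 with half quantized, and a fully quantized int8 layer
a = QuantizationSummary(counts=100, counts_quant=50, bits=3200, bits_quant=400)
b = QuantizationSummary(counts=100, counts_quant=100, bits=800, bits_quant=800)
total = a + b
```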

src/sparsezoo/utils/onnx/analysis.py

Lines changed: 4 additions & 4 deletions
```diff
@@ -48,7 +48,7 @@
     "get_numpy_distribution_statistics",
     "get_numpy_quantization_level",
     "get_numpy_bits",
-    "get_node_weight_bits",
+    "get_node_weight_precision",
     "get_node_param_counts",
     "get_node_kernel_shape",
 ]
@@ -485,13 +485,13 @@ def get_node_param_counts(
     return params, bias, sparse_params


-def get_node_weight_bits(
+def get_node_weight_precision(
     model_graph: ONNXGraph,
     node: NodeProto,
 ) -> int:
-    """Get the bits needed to store the node weights"""
+    """Get the precision of the node in number of bits"""
     node_weight = get_node_weight(model_graph, node)
-    return get_numpy_bits(node_weight)
+    return get_numpy_quantization_level(node_weight)


 def get_numpy_bits(arr: numpy.ndarray) -> int:
```
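The rename makes the per-element/total distinction explicit: precision is bits per weight, while storage is element count times precision. The stand-ins below are hypothetical illustrations of that distinction based on the numpy dtype (the real `get_numpy_quantization_level` and `get_numpy_bits` live in this module and may differ):

```python
import numpy


def numpy_quantization_level(arr: numpy.ndarray) -> int:
    """Hypothetical stand-in: precision in bits per element, from the dtype."""
    return arr.dtype.itemsize * 8


def numpy_bits(arr: numpy.ndarray) -> int:
    """Hypothetical stand-in: total storage bits, i.e. element count
    times per-element precision."""
    return arr.size * numpy_quantization_level(arr)


# a 4x4 int8 weight tensor: 8 bits of precision, 128 bits of storage
weights = numpy.zeros((4, 4), dtype=numpy.int8)
```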
