Parallelize support bundle collection and add Neo4j diagnostics by josho-sysdig · Pull Request #246 · draios/sysdig-cloud-scripts

josho-sysdig · 2026-04-30T15:26:19Z

Refactor get_support_bundle.sh to dispatch independent collection tasks as concurrent background jobs, reducing wall time on representative clusters (~40 pods, ~120 containers) from ~10m 36s to ~3m 32s with the default MAX_JOBS=6 setting.

Parallelization changes:

- Introduce three job-control helpers: run_bg (fork+register), throttle (cap concurrency via wait -n / sleep 0.1 fallback for Bash <4.3), and wait_all (harvest PIDs, report failures, clean up temp output files)
- Add MAX_JOBS variable (default 6, overridable via --max-jobs flag or environment variable) to prevent API server overload
- Refactor container log + support-file collection into collect_container_logs(), launched one background job per container
- Refactor per-node JSON manifest collection into collect_node_info(), launched one background job per node
- Refactor per-resource-type manifest collection into collect_resource_manifests(), launched one background job per type
- Refactor Cassandra stats/storage, Elasticsearch stats, PostgreSQL, MySQL, Kafka, and Zookeeper storage into dedicated functions, each dispatched as a background job
- Move kubectl cluster-info dump to a background job so it runs concurrently with log collection
- Directory creation and pod/node/container discovery remain serial to avoid race conditions
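
For readers who want a concrete picture of the job-control pattern described above, here is a minimal sketch. It is not the code from get_support_bundle.sh: only the helper names (run_bg, throttle, wait_all), the MAX_JOBS default of 6, and the wait -n / sleep 0.1 fallback come from this description. The BG_PIDS and BG_LABELS arrays and the label argument are assumptions made for illustration, and the temp-file cleanup mentioned above is left out.

```bash
#!/usr/bin/env bash
# Illustrative sketch only; the real helpers in get_support_bundle.sh may differ.

MAX_JOBS="${MAX_JOBS:-6}"   # concurrency cap; default taken from the PR description
BG_PIDS=()                  # assumed name: PIDs of jobs started by run_bg
BG_LABELS=()                # assumed name: label for each PID, same index

# Block until fewer than MAX_JOBS background jobs are running.
throttle() {
  while (( $(jobs -rp | wc -l) >= MAX_JOBS )); do
    # wait -n (Bash 4.3+) returns when any one job exits.
    # On older Bash it fails, so fall back to a short poll.
    wait -n 2>/dev/null || sleep 0.1
  done
}

# run_bg LABEL CMD [ARGS...]: start CMD in the background and register it.
run_bg() {
  local label=$1
  shift
  throttle
  "$@" &
  BG_PIDS+=("$!")
  BG_LABELS+=("$label")
}

# Wait for every registered job, report failures on stderr,
# and return the number of failed jobs (shell return codes cap at 255).
wait_all() {
  local i failed=0
  for i in "${!BG_PIDS[@]}"; do
    if ! wait "${BG_PIDS[$i]}"; then
      echo "FAILED: ${BG_LABELS[$i]}" >&2
      failed=$((failed + 1))
    fi
  done
  BG_PIDS=()
  BG_LABELS=()
  return "$failed"
}
```

With helpers like these, a dispatch loop could look like `run_bg "node-$n" collect_node_info "$n"` for each node, followed by a single `wait_all` before the archive is built. Keeping discovery and directory creation outside the background jobs, as the last bullet notes, means the jobs only write into paths that already exist.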

New collection: Neo4j diagnostics

- Add collect_neo4j_stats() to retrieve cluster server and database status via cypher-shell, reading the ingestion_admin password from the neo4jdb-user-secrets or neo4j-user-secrets Kubernetes secret
- Outputs neo4j/<pod>/cypher_show_servers.txt and neo4j/<pod>/cypher_show_databases.txt; writes a skip log if the secret is unavailable
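
The flow described above could be sketched roughly as follows. This is a hypothetical illustration, not the function from the PR: the secret names, the ingestion_admin user, and the two output file names come from the description, while the function arguments, the secret key name (assumed to be `ingestion_admin`), the skip-log file name `skipped.log`, and the use of `-d system` are assumptions.

```bash
#!/usr/bin/env bash
# Hypothetical sketch; items marked "assumed" are not taken from the PR.

# collect_neo4j_stats NAMESPACE POD OUT_DIR
collect_neo4j_stats() {
  local ns=$1 pod=$2 out_dir=$3
  local dest="$out_dir/neo4j/$pod"
  local secret password=""
  mkdir -p "$dest"

  # Try each secret name from the PR description, in order.
  for secret in neo4jdb-user-secrets neo4j-user-secrets; do
    # Assumed: the password lives under a data key named "ingestion_admin".
    password=$(kubectl -n "$ns" get secret "$secret" \
      -o jsonpath='{.data.ingestion_admin}' 2>/dev/null \
      | base64 --decode 2>/dev/null) || password=""
    if [ -n "$password" ]; then
      break
    fi
  done

  if [ -z "$password" ]; then
    # Assumed file name for the skip log.
    echo "Skipping Neo4j diagnostics: secret unavailable" > "$dest/skipped.log"
    return 0
  fi

  # Assumed: admin commands are sent to the system database.
  kubectl -n "$ns" exec "$pod" -- cypher-shell -u ingestion_admin \
    -p "$password" -d system "SHOW SERVERS" \
    > "$dest/cypher_show_servers.txt" 2>&1
  kubectl -n "$ns" exec "$pod" -- cypher-shell -u ingestion_admin \
    -p "$password" -d system "SHOW DATABASES" \
    > "$dest/cypher_show_databases.txt" 2>&1
}
```

Passing the password with `-p` keeps the sketch short, but it exposes the value in the pod's process list while the command runs; a real implementation may prefer another mechanism.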

README.md rewrite:

- Replace minimal usage stub with full man-page-style documentation covering NAME, SYNOPSIS, DESCRIPTION, OPTIONS, ENVIRONMENT, OUTPUT (annotated directory tree), EXAMPLES, and PARALLEL PROCESSING sections
- Document all flags including the new --max-jobs option
- Add neo4j/ output directory to the archive structure reference
- Include concurrency control design notes and performance benchmark table

josho-sysdig wants to merge 1 commit into master from support_bundle_parallel
