HDDS-14897. Add multiple S3 gateways to the rolling-upgrade suite #10028

Open
dombizita wants to merge 3 commits into apache:HDDS-14496-zdu from dombizita:HDDS-14897

@dombizita

What changes were proposed in this pull request?

Use HAProxy to load balance multiple S3 gateways. I made the necessary changes in docker-compose.yaml and adjusted the shell scripts accordingly. I didn't use the existing s3-haproxy.yaml, as the one in common did not work out of the box with the Ozone HA setup (I also found HDDS-14956). As this suite always needs multiple S3 gateways, I think it's okay to keep the HAProxy setup directly in docker-compose.yaml.
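As a rough illustration of the load-balancing idea (not the configuration from this patch), the sketch below writes a minimal HAProxy config that round-robins HTTP traffic across a list of gateway hosts. The helper name write_s3g_haproxy_cfg, the timeouts, and the use of port 9878 (the default Ozone S3 gateway HTTP port) are assumptions; the host names s3g1 to s3g3 are taken from the test log.

```shell
# Hypothetical helper, not part of the Ozone compose scripts.
# Usage: write_s3g_haproxy_cfg <output-file> <gateway-host>...
write_s3g_haproxy_cfg() {
  local cfg="$1"; shift
  local port=9878   # assumed S3 gateway HTTP port; adjust if overridden
  local host
  {
    echo "defaults"
    echo "  mode http"
    echo "  timeout connect 5s"
    echo "  timeout client 60s"
    echo "  timeout server 60s"
    echo
    echo "frontend s3"
    echo "  bind *:${port}"
    echo "  default_backend s3g"
    echo
    echo "backend s3g"
    echo "  balance roundrobin"
    # One server line per gateway; "check" enables health checks so a
    # gateway that is being restarted is taken out of rotation.
    for host in "$@"; do
      echo "  server ${host} ${host}:${port} check"
    done
  } > "${cfg}"
}
```

For example, write_s3g_haproxy_cfg haproxy.cfg s3g1 s3g2 s3g3 would produce a backend with three server entries.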

One change that stands out is in hadoop-ozone/dist/src/main/compose/testlib.sh. Without it, I hit this error:

OCI runtime exec failed: exec failed: unable to start container process: exec: "bash": executable file not found in $PATH

Cursor help: "This is from reorder_om_nodes in testlib.sh. It iterates over ALL containers and runs docker exec ... bash -c "...". The HAProxy container (ha-s3g-1) uses haproxy:lts-alpine — Alpine Linux — which only has sh, not bash."

This is new because the Ozone HA suite has never used the S3 HAProxy setup before, and reorder_om_nodes is only called for Ozone HA. The fix simply ignores the failure with || true; since the HAProxy container doesn't need ozone-site.xml, this is safe. The downside is that it also silently swallows genuine bash failures. Another option is to use sh instead of bash.
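For comparison, a third option (not what this patch does) would be to probe each container for bash first and skip only those that lack it, so that genuine failures still surface. This is a minimal sketch: container_has_bash and run_in_bash_containers are hypothetical names, and update_cmd stands in for the sed-based rewrite of ozone-site.xml.

```shell
# Hypothetical helper: succeeds only if the container has bash on its PATH.
# Uses sh for the probe, since Alpine-based images ship sh but not bash.
container_has_bash() {
  local container="$1"
  docker exec "${container}" sh -c 'command -v bash' >/dev/null 2>&1
}

# Hypothetical sketch: run update_cmd via bash in every listed container,
# skipping containers without bash and returning the last non-zero exit code.
# Usage: run_in_bash_containers <update_cmd> <container>...
run_in_bash_containers() {
  local update_cmd="$1"; shift
  local c rc=0
  for c in "$@"; do
    if ! container_has_bash "${c}"; then
      echo "Skipping ${c}: no bash available"
      continue
    fi
    docker exec "${c}" bash -c "${update_cmd}" || rc=$?
  done
  return "${rc}"
}
```

The trade-off is one extra docker exec per container, in exchange for not masking real errors.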

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-14897

How was this patch tested?

CI with the rolling upgrade test suite: https://github.com/dombizita/ozone/actions/runs/23846523428
With commenting out (current state on HDDS-14496-zdu): https://github.com/dombizita/ozone/actions/runs/23846562903

--- RESTARTING s3g1 WITH IMAGE 2.2.0 ---
Using Docker Compose v2
==============================================================================
2.2.0-2.2.0-2-s3g1-generate-generate-s3g1 :: Generate data                    
==============================================================================
Create a volume and bucket                                            | PASS |
------------------------------------------------------------------------------
Create key                                                            | PASS |
------------------------------------------------------------------------------
Create a bucket in s3v volume                                         | PASS |
------------------------------------------------------------------------------
Create key in the bucket in s3v volume                                | PASS |
------------------------------------------------------------------------------
Try to create a bucket using S3 API                                   | PASS |
------------------------------------------------------------------------------
Create key using S3 API                                               | PASS |
------------------------------------------------------------------------------
2.2.0-2.2.0-2-s3g1-generate-generate-s3g1 :: Generate data            | PASS |
6 tests, 6 passed, 0 failed
==============================================================================
Output:  /tmp/smoketest/upgrade/result/robot-2.2.0-2.2.0-2-s3g1-001.xml
Using Docker Compose v2
==============================================================================
2.2.0-2.2.0-2-s3g1-validate-generate-s3g1 :: Smoketest ozone cluster startup  
==============================================================================
Read data from previously created key                                 | PASS |
------------------------------------------------------------------------------
Read key created with Ozone Shell using S3 API                        | PASS |
------------------------------------------------------------------------------
Read key created with S3 API using S3 API                             | PASS |
------------------------------------------------------------------------------
2.2.0-2.2.0-2-s3g1-validate-generate-s3g1 :: Smoketest ozone clust... | PASS |
3 tests, 3 passed, 0 failed
==============================================================================
Output:  /tmp/smoketest/upgrade/result/robot-2.2.0-2.2.0-2-s3g1-002.xml
--- RESTARTING s3g2 WITH IMAGE 2.2.0 ---
Using Docker Compose v2
==============================================================================
2.2.0-2.2.0-2-s3g2-generate-generate-s3g2 :: Generate data                    
==============================================================================
Create a volume and bucket                                            | PASS |
------------------------------------------------------------------------------
Create key                                                            | PASS |
------------------------------------------------------------------------------
Create a bucket in s3v volume                                         | PASS |
------------------------------------------------------------------------------
Create key in the bucket in s3v volume                                | PASS |
------------------------------------------------------------------------------
Try to create a bucket using S3 API                                   | PASS |
------------------------------------------------------------------------------
Create key using S3 API                                               | PASS |
------------------------------------------------------------------------------
2.2.0-2.2.0-2-s3g2-generate-generate-s3g2 :: Generate data            | PASS |
6 tests, 6 passed, 0 failed
==============================================================================
Output:  /tmp/smoketest/upgrade/result/robot-2.2.0-2.2.0-2-s3g2-001.xml
Using Docker Compose v2
==============================================================================
2.2.0-2.2.0-2-s3g2-validate-generate-s3g2 :: Smoketest ozone cluster startup  
==============================================================================
Read data from previously created key                                 | PASS |
------------------------------------------------------------------------------
Read key created with Ozone Shell using S3 API                        | PASS |
------------------------------------------------------------------------------
Read key created with S3 API using S3 API                             | PASS |
------------------------------------------------------------------------------
2.2.0-2.2.0-2-s3g2-validate-generate-s3g2 :: Smoketest ozone clust... | PASS |
3 tests, 3 passed, 0 failed
==============================================================================
Output:  /tmp/smoketest/upgrade/result/robot-2.2.0-2.2.0-2-s3g2-002.xml
--- RESTARTING s3g3 WITH IMAGE 2.2.0 ---
Using Docker Compose v2
==============================================================================
2.2.0-2.2.0-2-s3g3-generate-generate-s3g3 :: Generate data                    
==============================================================================
Create a volume and bucket                                            | PASS |
------------------------------------------------------------------------------
Create key                                                            | PASS |
------------------------------------------------------------------------------
Create a bucket in s3v volume                                         | PASS |
------------------------------------------------------------------------------
Create key in the bucket in s3v volume                                | PASS |
------------------------------------------------------------------------------
Try to create a bucket using S3 API                                   | PASS |
------------------------------------------------------------------------------
Create key using S3 API                                               | PASS |
------------------------------------------------------------------------------
2.2.0-2.2.0-2-s3g3-generate-generate-s3g3 :: Generate data            | PASS |
6 tests, 6 passed, 0 failed
==============================================================================
Output:  /tmp/smoketest/upgrade/result/robot-2.2.0-2.2.0-2-s3g3-001.xml
Using Docker Compose v2
==============================================================================
2.2.0-2.2.0-2-s3g3-validate-generate-s3g3 :: Smoketest ozone cluster startup  
==============================================================================
Read data from previously created key                                 | PASS |
------------------------------------------------------------------------------
Read key created with Ozone Shell using S3 API                        | PASS |
------------------------------------------------------------------------------
Read key created with S3 API using S3 API                             | PASS |
------------------------------------------------------------------------------
2.2.0-2.2.0-2-s3g3-validate-generate-s3g3 :: Smoketest ozone clust... | PASS |
3 tests, 3 passed, 0 failed
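
The log above follows a per-gateway pattern: restart one gateway, generate data, then validate it. A minimal sketch of such a loop is shown here, assuming hypothetical callbacks; rolling_restart, restart_fn, and test_fn are illustrative names and not functions from the Ozone test library.

```shell
# Hypothetical sketch of a rolling restart loop.
# restart_fn is called as: restart_fn <service> <image_version>
# test_fn is called as:    test_fn <generate|validate> <service>
# Usage: rolling_restart <image_version> <restart_fn> <test_fn> <service>...
rolling_restart() {
  local image_version="$1" restart_fn="$2" test_fn="$3"; shift 3
  local svc
  for svc in "$@"; do
    echo "--- RESTARTING ${svc} WITH IMAGE ${image_version} ---"
    # Stop at the first failure so a broken gateway is not masked
    # by later, healthy ones behind the load balancer.
    "${restart_fn}" "${svc}" "${image_version}" || return 1
    "${test_fn}" generate "${svc}" || return 1
    "${test_fn}" validate "${svc}" || return 1
  done
}
```

Restarting one gateway at a time keeps the others available behind the proxy while each one is checked.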

@dombizita dombizita requested review from adoroszlai and errose28 April 2, 2026 05:48
@github-actions github-actions bot added the zdu Pull requests for Zero Downtime Upgrade (ZDU) https://issues.apache.org/jira/browse/HDDS-14496 label Apr 2, 2026

@adoroszlai adoroszlai left a comment

Thanks @dombizita, LGTM.

export SECURITY_ENABLED="true"

-create_data_dirs dn{1..5} kms om{1..3} recon s3g scm{1..3}
+create_data_dirs dn{1..5} kms om{1..3} recon s3g s3g{1..3} scm{1..3}

nit: s3g data dir is no longer used, but it's harmless, no need to update patch just for this.

sed -i -e 's/om1,om2,om3/${new_order}/' /etc/hadoop/ozone-site.xml; \
echo 'Replaced OM order with ${new_order} in ${c}'; \
-fi"
+fi" || true

"silently swallow genuine bash failures"

This is fine. In the worst case OM client will contact follower first with the original order.
