add snapshot create delete capability #111
disperate wants to merge 42 commits into cloudscale-ch:master from disperate:julian/add-snapshot-create-delete-capability
Conversation
mweibel
left a comment
looks good so far. I have a few questions and mostly nits.
For reviewers' sake: it would be great to have an example in the examples folder, ready to use for testing this. I haven't tested it on a cluster yet, although I did install the version to see if it starts and whether we get any immediate error logs (we don't).
mweibel
left a comment
overall this looks very good. Most of the things I commented on are not super critical. Good work 👏
- apiGroups: ["snapshot.storage.k8s.io"]
  resources: ["volumesnapshots"]
  verbs: ["get", "list", "watch", "update"]
- apiGroups: ["snapshot.storage.k8s.io"]
  resources: ["volumesnapshotcontents"]
  verbs: ["get", "list"]
maybe we misunderstood each other in the first review.
I'd keep these in, because external-provisioner has them as well.
In the end it doesn't really matter, because snapshotter-role also maps those to the same SA, but I think keeping them makes it easier to update to newer provisioner versions later, which may come with updated role definitions.
what's the reason for this not to be in the chart?
driver/controller.go
AccessibleTopology: []*csi.Topology{
	{
		Segments: map[string]string{
			topologyZonePrefix: d.zone,
Suggested change:
-			topologyZonePrefix: d.zone,
+			topologyZonePrefix: snapshot.Zone.Slug,
shouldn't we use snapshot.Zone.Slug instead?
The d.zone is not entirely incorrect. Since the driver does not currently work across zones, it gets the zone from the metadata of the node on which it is running. The same zone logic happens during normal volume creation.
I think we will move to multi-zone support at some point, for which we will need to implement proper topology support. So I decided to refactor this whole section a bit: we now use the actual data from the volume to create the csiVolume.
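For illustration, a minimal sketch of that refactor, assuming the cloudscale volume type exposes UUID, SizeGB and Zone.Slug and that topologyZonePrefix is the driver's existing topology key; the helper name and SDK import path are assumptions, not the exact code in this PR:

package driver

import (
	"github.com/cloudscale-ch/cloudscale-go-sdk/v2" // assumed SDK major version
	"github.com/container-storage-interface/spec/lib/go/csi"
)

// buildCSIVolume is a hypothetical helper: the topology segment is taken from
// the volume's own zone (vol.Zone.Slug) rather than the driver's node zone
// (d.zone), so the same code keeps working once multi-zone support lands.
func buildCSIVolume(vol *cloudscale.Volume, topologyZonePrefix string) *csi.Volume {
	return &csi.Volume{
		VolumeId:      vol.UUID,
		CapacityBytes: int64(vol.SizeGB) << 30, // GiB to bytes
		AccessibleTopology: []*csi.Topology{
			{
				Segments: map[string]string{
					topologyZonePrefix: vol.Zone.Slug,
				},
			},
		},
	}
}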
assert.NoError(t, err)

// Wait a bit for the PVC to be processed
time.Sleep(10 * time.Second)
I'd prefer if we could fetch the PVC more often and in a loop, but overall wait a bit longer. 10s is quite a long time in tests and may still fail; a loop would likely result in faster tests and fewer flakes.
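A hedged sketch of that suggestion, assuming the test has a kubernetes.Interface client and knows the PVC's namespace and name; the helper name, interval and timeout are made up for illustration:

// Assumes the test file already imports:
//   v1 "k8s.io/api/core/v1"
//   metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
//   "k8s.io/apimachinery/pkg/util/wait"
//   "k8s.io/client-go/kubernetes"
func waitForPVCBound(ctx context.Context, t *testing.T, client kubernetes.Interface, namespace, name string) {
	t.Helper()
	// Poll every 2s for up to 1 minute instead of a single fixed sleep.
	err := wait.PollUntilContextTimeout(ctx, 2*time.Second, time.Minute, true,
		func(ctx context.Context) (bool, error) {
			pvc, err := client.CoreV1().PersistentVolumeClaims(namespace).Get(ctx, name, metav1.GetOptions{})
			if err != nil {
				return false, nil // transient API error: keep polling
			}
			return pvc.Status.Phase == v1.ClaimBound, nil
		})
	if err != nil {
		t.Fatalf("PVC %s/%s not bound in time: %v", namespace, name, err)
	}
}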
if time.Now().UnixNano()-start.UnixNano() > (5 * time.Minute).Nanoseconds() {
	t.Fatalf("timeout exceeded while waiting for volume snapshot %v to be ready", name)
	return
nit: the return is unnecessary, but it can be left in if that's preferred for clarity.
t.Fatalf already exits right on the spot.
| t.Logf("Volume snapshot %q not ready yet; waiting...", name) | ||
| time.Sleep(5 * time.Second) | ||
| } | ||
| } |
this code is a bit complex and could be simplified and made more idiomatic.
Several suggestions:
- use time.Since():
func waitForVolumeSnapshot(t *testing.T, client kubernetes.Interface, name string) {
	const timeout = 5 * time.Minute
	const pollInterval = 5 * time.Second
	start := time.Now()
	t.Logf("Waiting for volume snapshot %q to be ready...", name)
	for {
		snapshot := getVolumeSnapshot(t, client, name)
		if snapshot.Status != nil && snapshot.Status.ReadyToUse != nil && *snapshot.Status.ReadyToUse {
			t.Logf("Volume snapshot %q is ready", name)
			return
		}
		if time.Since(start) > timeout {
			t.Fatalf("timeout exceeded while waiting for volume snapshot %q to be ready", name)
		}
		t.Logf("Volume snapshot %q not ready yet; waiting...", name)
		time.Sleep(pollInterval)
	}
}
- use ticker and select:
func waitForVolumeSnapshot(ctx context.Context, t *testing.T, name string) {
	const pollInterval = 5 * time.Second
	t.Logf("Waiting for volume snapshot %q to be ready...", name)
	ticker := time.NewTicker(pollInterval)
	defer ticker.Stop()
	for {
		snapshot := getVolumeSnapshot(t, ctx, name)
		if snapshot.Status != nil && snapshot.Status.ReadyToUse != nil && *snapshot.Status.ReadyToUse {
			t.Logf("Volume snapshot %q is ready", name)
			return
		}
		select {
		case <-ctx.Done():
			t.Fatalf("timeout waiting for volume snapshot %q: %v", name, ctx.Err())
		case <-ticker.C:
			t.Logf("Volume snapshot %q not ready yet; waiting...", name)
		}
	}
}
- use wait.PollUntilContextTimeout:
func waitForVolumeSnapshot(ctx context.Context, t *testing.T, name string) {
t.Logf("Waiting for volume snapshot %q to be ready...", name)
err := wait.PollUntilContextTimeout(ctx, 5*time.Second, 5*time.Minute, true, func(ctx context.Context) (done bool, err error) {
snapshot := getVolumeSnapshot(t, name)
if snapshot.Status != nil && snapshot.Status.ReadyToUse != nil && *snapshot.Status.ReadyToUse {
t.Logf("Volume snapshot %q is ready", name)
return true, nil
}
t.Logf("Volume snapshot %q not ready yet; waiting...", name)
return false, nil
})
if err != nil {
t.Fatalf("failed waiting for volume snapshot %q: %v", name, err)
}
}personally, I'd use option 3 in this case because we're already within a package importing kubernetes code.
ctx in the caller you get by using t.Context().
Adds support for ControllerServiceCapability_RPC_CREATE_DELETE_SNAPSHOT.
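For reference, a hedged sketch of how a CSI controller typically advertises such a capability in its ControllerGetCapabilities response; the receiver type and the surrounding capability list are assumptions, not the exact code in this driver:

// Assumes: "github.com/container-storage-interface/spec/lib/go/csi"
func (d *Driver) ControllerGetCapabilities(ctx context.Context, req *csi.ControllerGetCapabilitiesRequest) (*csi.ControllerGetCapabilitiesResponse, error) {
	// Hypothetical capability list; CREATE_DELETE_SNAPSHOT is the relevant entry.
	rpcCaps := []csi.ControllerServiceCapability_RPC_Type{
		csi.ControllerServiceCapability_RPC_CREATE_DELETE_VOLUME,
		csi.ControllerServiceCapability_RPC_PUBLISH_UNPUBLISH_VOLUME,
		csi.ControllerServiceCapability_RPC_EXPAND_VOLUME,
		csi.ControllerServiceCapability_RPC_CREATE_DELETE_SNAPSHOT, // what this PR adds
	}

	caps := make([]*csi.ControllerServiceCapability, 0, len(rpcCaps))
	for _, c := range rpcCaps {
		caps = append(caps, &csi.ControllerServiceCapability{
			Type: &csi.ControllerServiceCapability_Rpc{
				Rpc: &csi.ControllerServiceCapability_RPC{Type: c},
			},
		})
	}
	return &csi.ControllerGetCapabilitiesResponse{Capabilities: caps}, nil
}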