-
Notifications
You must be signed in to change notification settings - Fork 2
SSH key manager #290
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
SSH key manager #290
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should consider treating .ssh as read-only as much as possible. Keys generated by fileglancer could be stored elsewhere.
The main file that may need to be written to here might be .ssh/authorized_keys . We should consider if .ssh/authorized_keys2 could be used or if the sshd configuration could be modified to recognize an authorized_keys file exclusively managed by fileglancer.
Here is the documentation for OpenSSH sshd:
https://man.openbsd.org/sshd_config#AuthorizedKeysFile
|
I added a |
|
@mkitti As discussed, we agree about "treating .ssh as read-only as much as possible". To minimize complexity and because for now this feature has a very small user base, I've modified the GUI to do the most minimal thing possible to |
|
In a few places we still read the private key into resident memory of the Python process as an immutable Python string. This is not good. Core dumps of the Python process may contain the contents of the private key. Rather than loading file contents into an immutable string, we should read the contents into a mutable import os
key_path = "private_key.pem"
file_size = os.path.getsize(key_path)
# 1. Pre-allocate mutable memory
sensitive_data = bytearray(file_size)
# 2. Read directly into the pre-allocated buffer
with open(key_path, "rb") as f:
f.readinto(sensitive_data)
# Use the bytes as needed
# 3. Securely overwrite the memory with zeros
for i in range(len(sensitive_data)):
sensitive_data[i] = 0
# 4. Now safe to remove the reference
del sensitive_dataReferences:
|
mkitti
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider using a bytearray and perhaps a context manager to minimize time that a private key spends in resident memory of the Python process.
|
|
||
| class SSHKeyContent(BaseModel): | ||
| """SSH key content - only fetched on demand""" | ||
| key: str = Field(description="The requested key content") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider making the private key contents a mutable bytearray that we can explicitly overwrite to clear from memory. Avoid converting the bytearray to an immutable string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we could make SSHKeyContent follow the context manager protocol so we can use a with statement. On __exit__ the context manager would then make sure to shred the key contents in memory by zeroing out the resident bytearray. You could use contextlib to help with this.
I also suggest centralizing the the potentially sensitive reading implementation here.
- The class would only store a string or Path containing the location of the key file.
- The context manager
__enter__would read the file contents into abytearrayviareadintoand then return thatbytearray - The context manager
__exit__would then explicitly overwrite thebytearray.
This will minimize the time the any sensitive secret spends in resident memory.
|
|
||
| class GenerateKeyRequest(BaseModel): | ||
| """Request body for generating an SSH key""" | ||
| passphrase: Optional[str] = Field(default=None, description="Optional passphrase to protect the private key") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider making the passphrase a mutable bytearray that we can explicitly overwrite to clear from memory. Avoid converting the bytearray to an immutable string.
fileglancer/sshkeys.py
Outdated
| if not os.path.exists(private_key_path): | ||
| raise ValueError(f"Private key '{filename}' not found") | ||
| with open(private_key_path, 'r') as f: | ||
| return SSHKeyContent(key=f.read()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use readinto to read into a mutable bytearray.
| with _get_user_context(username): | ||
| try: | ||
| ssh_dir = sshkeys.get_ssh_directory() | ||
| return sshkeys.get_key_content(ssh_dir, "id_ed25519", key_type) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shred the key contents in a bytearray after usage. Ensure the contents get shreded even if an error occurs. Consider using a context manager here.
|
|
||
| # Append the key | ||
| with open(authorized_keys_path, 'a') as f: | ||
| f.write(public_key + '\n') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we add a restrict keyword here? Does Seqera need an interactive terminal or port forwarding capabilities?
https://manpages.debian.org/experimental/openssh-server/authorized_keys.5.en.html#restrict
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we could also restrict what domain names or IP addresses this key can be used from?
https://manpages.debian.org/experimental/openssh-server/authorized_keys.5.en.html#from=_pattern-list_
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the end could we add a comment as the last space delimited field that says this was added by fileglancer for seqera?
|
If I understand correctly, the exclusive user of the private key should be Seqera alone. The private key does not need to be used by the user for any other purpose. Why do we need to store the private or public key to disk at all other than adding it to the authorized_keys file? Rather we only need to transiently show the private key to the user via the web interface to copy into Seqera. If they fail to do so, then we just generate another private key anew the next time. That process would work more like how Github personal access tokens work. Take a look at how the user interface works for tokens.
If you push the "Generate a new token" button and fill out the form, it will show you the token once in this example (I deleted the key before posting the screenshot):
It warns that you will not be able to see the personal access token again. Similarly, we can issue the same warning when showing the user the private key for Seqera. If they lose the key, we can just generate another private key. In summary, there is no need to write the private key to |
|
I implemented the bytearray clearing. The rest are also good ideas but will be deferred to a future release. |
| result = sensitive_buffer.decode('utf-8') | ||
|
|
||
| return result |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Decoding this to a string does defeat the point of using a bytearray. If we could avoid that somehow, that would be better. The problem is that we do not know when the string will be garbage collected and we cannot control when the string will be overwritten.
If elimanating the use of string is not possible, minimizing the number of references to the content would be helpful. In this case, just return directly rather than adding a named reference in result.
| result = sensitive_buffer.decode('utf-8') | |
| return result | |
| return sensitive_buffer.decode('utf-8') |
| key_content = _read_file_secure(private_key_path) | ||
| return SSHKeyContent(key=key_content) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| key_content = _read_file_secure(private_key_path) | |
| return SSHKeyContent(key=key_content) | |
| return SSHKeyContent( | |
| key=_read_file_secure(private_key_path) | |
| ) |
Remove extra references to the string.
|
Decoding the bytearray to a string largely defeats the point of using the bytearray. The problem is that we cannot control when the string will be cleared from memory. At minimum, we could try to reduce references to the string to increase the chances that it can be immediately garbage collected. Gemini suggests that it might be possible to keep using a bytearray and avoid the string by using a custom Response class. from fastapi import FastAPI, Response
import gc
app = FastAPI()
class SecureResponse(Response):
def __init__(self, content: bytearray, *args, **kwargs):
super().__init__(content=content, *args, **kwargs)
# Store a reference to the mutable bytearray to clear it later
self._sensitive_content = content
async def __call__(self, scope, receive, send):
# 1. Standard response logic sends the data to the client
await super().__call__(scope, receive, send)
# 2. Immediately zero out the sensitive data after sending
for i in range(len(self._sensitive_content)):
self._sensitive_content[i] = 0
# 3. Explicitly clear the reference and suggest garbage collection
del self._sensitive_content
gc.collect()
@app.get("/secure-data")
async def get_secure_data():
# sensitive_data must be a mutable bytearray
sensitive_data = bytearray(b"SENSITIVE_SECRET_2026")
return SecureResponse(
content=sensitive_data,
media_type="application/octet-stream"
)I would consider a try-finally around the call to |
|
Using the following procedure I was able to identify multiple copies of I then got the following results, showing two copies. This confirms to me that Python is hanging on to the contents of the private key in memory. While I think there may be ways to eliminate the private key from Python's memory, I am questioning if ssh key generation even needs to be done on the server-side at all. There appears to be at least one project, js-keygen, that is meant for creating ssh keys. The LLMs suggest that the Web Crypto API is now sufficient for the purpose of key generation. // Native Ed25519 generation in modern browsers (2025/2026)
async function generateNativeEd25519() {
const keyPair = await window.crypto.subtle.generateKey(
{ name: "Ed25519" },
true, // Must be extractable to export for Seqera
["sign", "verify"]
);
// To turn this into an SSH key, we must export the raw bytes
const pubRaw = await window.crypto.subtle.exportKey("raw", keyPair.publicKey);
const privRaw = await window.crypto.subtle.exportKey("pkcs8", keyPair.privateKey);
return { pubRaw, privRaw };
}The bulk of the remaining code is reformatting the output of this function into OpenSSH format. Thus my current suggestion is that we generate the key pairs on the client side in the browser. In this case the only thing to keep here is code to add the public key to the authorized_keys file. Only the public key needs to be transmitted to the server for that purpose. |
That's for sure an interesting idea, though we need to think through the implications of adding this kind of complexity to the client. Additional dependencies will bloat the downloads required to have the page load. Use of bleeding edge APIs could create browser compatibility issues. More broadly, it feels worth thinking about all this in context of our threat model. What are we actually defending against here? Fileglancer runs on the internal network, and host access is restricted to a small number of admins. Given that context, how much value does each of these additional security features provide, when you weigh each one against its cost? |


This PR adds an SSH key manager to the profile menu in the upper right. This lets you create SSH keys and authorize them so that you can seamlessly move between cluster nodes. The main purpose for this is to allow easy on boarding for our Seqera Platform instance.
I tried to be careful about making backups of files in ~/.ssh before changing them, but would welcome feedback especially if there are any edge cases I didn't think of. If you are testing this, I would recommend making a backup of your ~/.ssh directory first, just in case.
@JaneliaSciComp/fileglancer @StephanPreibisch