Describe the bug
The ResultVisitor.visitStringValue() method in apache-age-python uses .strip('"') to remove surrounding quotes from agtype string tokens. Python's str.strip() removes all matching characters from both ends, not just one. This causes data corruption when a string property value starts or ends with an escaped double quote (\"), because the " that is part of the actual data gets stripped along with the delimiter quote. The same bug exists in visitPair().
For example, a property value foo "bar" is serialized by AGE as the agtype token "foo \"bar\"". Calling .strip('"') produces foo \"bar\ instead of the correct foo \"bar\".
The fix is to replace .strip('"') with [1:-1] to remove exactly the first and last character.
Before applying this fix, it should be verified that every token reaching these methods actually has both a leading and a trailing double quote. If a token can arrive without them, [1:-1] would wrongly delete real content.
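The difference between the two approaches can be reproduced without AGE. The following is a minimal sketch; remove_delimiters is a hypothetical helper name, not part of apache-age-python, and the guard on both ends is one possible way to address the concern about tokens that lack delimiters.

```python
def remove_delimiters(token: str) -> str:
    """Remove exactly one leading and one trailing double quote.

    Hypothetical helper: the token is returned unchanged when it is
    not wrapped in double quotes on both ends.
    """
    if len(token) >= 2 and token[0] == '"' and token[-1] == '"':
        return token[1:-1]
    return token


# The agtype token for the property value: foo "bar"
token = r'"foo \"bar\""'

stripped = token.strip('"')        # removes every quote at both ends
sliced = remove_delimiters(token)  # removes only the two delimiters

print(stripped)  # foo \"bar\
print(sliced)    # foo \"bar\"
print(remove_delimiters("bare"))  # bare
```

The guarded version behaves like [1:-1] for well-formed tokens and leaves anything else untouched, so it cannot delete content from an undelimited token.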
How are you accessing AGE (Command line, driver, etc.)?
apache-age-python driver (via psycopg + AGE Python client)
What data setup do we need to do?
SELECT * FROM cypher('test_graph', $$
CREATE (a:TestNode {name: 'This value ends with a "quote"'})
$$) AS (a agtype);
What is the necessary configuration info needed?
- PostgreSQL with Apache AGE extension
- apache-age-python package (tested with latest PyPI version)
What is the command that caused the error?
import age

# conn is assumed to be an open psycopg connection to the database.
age.setUpAge(conn, "test_graph")
with conn.cursor() as cursor:
    cursor.execute("""
        SELECT * FROM cypher('test_graph', $$
            MATCH (a:TestNode) RETURN a.name
        $$) AS (name agtype);
    """)
    for row in cursor:
        result = age.parseAgeValue(row[0])
        print(repr(result))
        # Expected: 'This value ends with a "quote"'
        # Actual: 'This value ends with a "quote\\'
The root cause in builder.py:
# Current (broken):
def visitStringValue(self, ctx:AgtypeParser.StringValueContext):
    return ctx.STRING().getText().strip('"')
Expected behavior
String property values containing double quotes should survive a round-trip (write → read) without data loss. visitStringValue() and visitPair() should remove exactly the first and last delimiter characters, not strip all matching characters from both ends.
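A round-trip check for this expectation could look like the sketch below. It rests on an assumption that is not confirmed in this report: that agtype string tokens use JSON-compatible escaping, so json.dumps can stand in for AGE's serialization and json.loads can decode the token. decode_agtype_string is a hypothetical name, not a function in the driver.

```python
import json


def decode_agtype_string(token: str) -> str:
    """Decode a quoted string token.

    Assumption: the token follows JSON string syntax, so json.loads
    removes the two delimiters and resolves escapes such as \\".
    """
    return json.loads(token)


original = 'This value ends with a "quote"'

# Stand-in for what AGE would return for the stored property value.
token = json.dumps(original)

# Round trip with delimiter-aware decoding: the value is preserved.
assert decode_agtype_string(token) == original

# Round trip with .strip('"'): the final data quote is lost.
assert token.strip('"') != original
assert token.strip('"').endswith("\\")
```

Decoding this way would also resolve the escape sequences, which plain slicing with [1:-1] does not do. Whether that is desirable for the driver is a separate decision.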
Environment (please complete the following information):
- AGE: 1.6.0
- apache-age-python: latest (PyPI)
- Python: 3.12
- PostgreSQL: 16
Additional context
The visitPair() method has the same issue when parsing map keys from agtype objects:
# Also broken:
def visitPair(self, ctx:AgtypeParser.PairContext):
    self.visitChildren(ctx)
    return (ctx.STRING().getText().strip('"'), ctx.agValue())