added support for Django ORM Models & Python Classes

xnuinside · xnuinside · commit b15edf837e0e · 2021-05-09T01:20:29.000+03:00
diff --git a/CHANGELOG.txt b/CHANGELOG.txt
@@ -1,6 +1,7 @@
 **v0.3.0**
 1. Added cli - `pmp` command with args -d, --dump  
-2. Add support for pure Python classes
+2. Added support for simple Django ORM models
+3. Added base support for pure Python Classes
 
 **v0.2.0**
 1. Added support for Dataclasses
diff --git a/README.md b/README.md
@@ -10,9 +10,11 @@ Py-Models-Parser can parse & extract information from models:
 - Sqlalchemy ORM,
 - Gino ORM,
 - Tortoise ORM,
+- Django ORM Model,
 - Pydantic,
 - Python Enum,
 - Python Dataclasses
+- pure Python Classes
 
 Number of supported models will be increased, check 'TODO' section.
 
@@ -168,15 +170,21 @@ For model from point 1 (above) library will produce the result:
 
 ## TODO: in next Release
 
-1. Add more tests for supported models (and fix existed not covered cases): Pydantic, Enums, Dataclasses, SQLAlchemy Models, GinoORM models, TortoiseORM models
-2. Add support for pure SQLAlchemy Core Tables
+1. Add more tests for supported models (and fix existing uncovered cases): Django ORM, Pydantic, Enums, Dataclasses, SQLAlchemy Models, GinoORM models, TortoiseORM models
+2. Add support for SQLAlchemy Core Tables
 3. Add support for Pony ORM models
+4. Add support for Piccolo ORM models
 
 ## Changelog
+**v0.3.0**
+1. Added cli - `pmp` command with args -d, --dump  
+2. Added support for simple Django ORM models
+3. Added base support for pure Python Classes
+
 **v0.2.0**
 1. Added support for Dataclasses
 2. Added parse_from_file method
-3. Added correct work with types with comma inside, like: Union[dict, list] or Union[dict, list, tuple, anything]
+3. Added correct work with types with comma inside, like: Union[dict, list] or Union[dict, list, tuple, anything] 
 
 **v0.1.1**
 1. Added base parser logic & tests for Pydantic, Enums, SQLAlchemy Models, GinoORM models, TortoiseORM models 
diff --git a/docs/README.rst b/docs/README.rst
@@ -21,7 +21,20 @@ Py-Models-Parser
 
 
 It's as second Parser that done by me, first is a https://github.com/xnuinside/simple-ddl-parser for SQL DDL with different dialects.
-Py-Models-Parser supports now ORM Sqlalchemy, Gino, Tortoise; Pydantic, Python Enum models, Dataclasses & in nearest feature I plan to add pure pyton classes. And next will be added other ORMs models.
+
+Py-Models-Parser can parse & extract information from models:
+
+
+* Sqlalchemy ORM,
+* Gino ORM,
+* Tortoise ORM,
+* Django ORM Model,
+* Pydantic,
+* Python Enum,
+* Python Dataclasses
+* pure Python Classes
+
+Number of supported models will be increased, check 'TODO' section.
 
 Py-Models-Parser written with PEG parser and it's python implementation - parsimonious. It's pretty new and I did not cover all possible test cases, so if you will have an issue  - please just open an issue in this case with example, I will fix it as soon as possible.
 
@@ -69,7 +82,8 @@ How to use
 
 Library detect automaticaly that type of models you tries to parse. You can check a lot of examples in test/ folder on the GitHub
 
-You can parse models from python string:
+
+#. You can parse models from python string:
 
 .. code-block:: python
 
@@ -92,7 +106,8 @@ You can parse models from python string:
        """
    result = parse(models_str)
 
-or just provide the path to file:
+
+#. Parse models from file:
 
 .. code-block:: python
 
@@ -104,7 +119,31 @@ or just provide the path to file:
        # for example: tests/data/dataclass_defaults.py
        result = parse_from_file(file_path)
 
-It will produce the result:
+
+#. Parse models from file with command line
+
+.. code-block:: bash
+
+
+       pmp path_to_models.py 
+
+       # for example: pmp tests/data/dataclass_defaults.py
+
+Output from the CLI can be dumped to the 'output_models.json' file with the '-d' / '--dump' flag. To change the target file name, pass it after the flag, for example '-d target_file.json'.
+
+.. code-block:: bash
+
+
+       # example how to dump output from cli
+
+       pmp path_to_models.py -d target_file.json
+
+Output example
+^^^^^^^^^^^^^^
+
+You can find a lot of output examples in tests - https://github.com/xnuinside/py-models-parser/tree/main/tests
+
+For model from point 1 (above) library will produce the result:
 
 .. code-block:: python
 
@@ -153,15 +192,21 @@ TODO: in next Release
 ---------------------
 
 
-#. Parse from file method
-#. Add cli
-#. Add more tests for supported models (and fix existed not covered cases): Pydantic, Enums, Dataclasses, SQLAlchemy Models, GinoORM models, TortoiseORM models
-#. Add support for pure Python classes
-#. Add support for pure SQLAlchemy Core Tables
+#. Add more tests for supported models (and fix existing uncovered cases): Django ORM, Pydantic, Enums, Dataclasses, SQLAlchemy Models, GinoORM models, TortoiseORM models
+#. Add support for SQLAlchemy Core Tables
+#. Add support for Pony ORM models
+#. Add support for Piccolo ORM models
 
 Changelog
 ---------
 
+**v0.3.0**
+
+
+#. Added cli - ``pmp`` command with args -d, --dump  
+#. Added support for simple Django ORM models
+#. Added base support for pure Python Classes
+
 **v0.2.0**
 
 
diff --git a/py_models_parser/grammar.py b/py_models_parser/grammar.py
@@ -2,21 +2,24 @@
 
 grammar = Grammar(
     r"""
-    expr = (class / call_result / attr_def / emptyline)*
-    class = class_def attr_def* ws?
+    expr = (class / call_result / attr_def / emptyline / funct_def)*
+    class = class_def attr_def* funct_def* ws?
     class_def   = intend? class_name args? ":"* ws?
-    attr_def  = intend? id ("=" right_part)* ws?
-    right_part = args / call_result / id / string / text
+    attr_def  = intend? id type? ("=" (right_part))* ws?
+    right_part =  (id args_in_brackets) / string / args  / call_result / args_in_brackets / id / text
+    type = ":" ( (id args_in_brackets) / id)
     string = one_quote_str / double_quotes_str
-    one_quote_str = ~"\'[^\']+\'"
-    double_quotes_str = ~'"[^\"]+"'
+    one_quote_str = ~"\'[^\']+\'"i
+    double_quotes_str = ~'"[^\"]+"'i
+    funct_def = intend? "def" id args? ":"* ws?
+    args_in_brackets = "[" ((id/string)* ","* )* "]"
     args     = "(" (( call_result / args / attr_def / id )* ","* )* ")"
     call_result = id args ws?
     class_name  = "class" id
-    id          = (((dot_id / text)+ ","*) *  / dot_id / text) ws?
+    id          = (((dot_id / text)+ ) *  / dot_id / text) ws?
     dot_id      = (text".")*text
-    intend      = "    " / "\t"
-    text        = !class ~"['\_A-Z 0-9\{\}\[\]_\"\-\/\$:<%>\w]*"i
+    intend      = "    " / "\t" / "\n"
+    text        = !class ~"['\_A-Z 0-9\{\}_\"\-\/\$<%>\+\-\w]*"i
     ws          = ~"\s*"
     emptyline   = ws+
 """
diff --git a/py_models_parser/visitor.py b/py_models_parser/visitor.py
@@ -1,4 +1,4 @@
-from typing import Dict
+from typing import Dict, Tuple
 
 from parsimonious.nodes import NodeVisitor
 
@@ -47,7 +47,7 @@ def extract_orm_attr(self, text: str):
         default = None
         not_orm = True
         properties = {}
-        orm_columns = ["Column", "Field", "relationship"]
+        orm_columns = ["Column", "Field", "relationship", "ForeignKey"]
         for i in orm_columns:
             if i in text:
                 not_orm = False
@@ -56,16 +56,15 @@ def extract_orm_attr(self, text: str):
                 base_text = text
                 text = text[index + 1 : -1]  # noqa E203
                 text = text.split(",")
-
                 text = self.clean_up_cases_with_inner_pars(text)
                 if i == "Field":
-                    # for tortoise orm
-                    split_by_field = base_text.split("Field")[0].split(".")
-                    if len(split_by_field) == 2:
-                        _type = split_by_field[1]
-                    else:
-                        _type = split_by_field[0]
+                    _type, properties = get_django_info(text, base_text, properties)
                     prop_index = 0
+                elif i == "ForeignKey":
+                    # means it is a Django models.ForeignKey
+                    _type = "serial"
+                    properties["foreign_key"] = text[0]
+                    prop_index = 1
                 else:
                     prop_index = 1
                     _type = text[0]
@@ -108,33 +107,48 @@ def visit_attr_def(self, node, visited_children):
         _type = None
         if "def " in left:
             attr = {"attr": {"name": None, "type": _type, "default": default}}
+
             return attr
         if ":" in left:
             _type = left.split(":")[-1].strip()
             left = left.split(":")[0].strip()
         attr = {"attr": {"name": left, "type": _type, "default": default}}
         for children in visited_children:
+
             if isinstance(children, list):
                 if isinstance(children[-1], list):
                     if "default" in children[-1][-1]:
                         attr["attr"]["default"] = children[-1][-1]["default"]
                         attr["attr"]["properties"] = children[-1][-1]["properties"]
                         if children[-1][-1]["type"] is not None:
                             attr["attr"]["type"] = children[-1][-1]["type"]
+                elif isinstance(children[-1], dict) and "type" in children[-1]:
+                    attr["attr"]["type"] = children[-1]["type"]
         return attr
 
     def process_chld(self, child, final_child):
         if "attr" in child and child["attr"]["name"]:
-            if "tablename" in child["attr"]["name"]:
+            # todo: this is a hack, needs refactoring
+            if child["attr"]["name"] == "self" and not final_child["properties"].get(
+                "init"
+            ):
+                final_child["properties"]["init"] = []
+            elif "tablename" in child["attr"]["name"]:
                 final_child["properties"]["table_name"] = child["attr"]["default"]
             elif "table_args" in child["attr"]["name"]:
                 final_child["properties"][child["attr"]["name"]] = (
                     child["attr"]["type"] or child["attr"]["default"]
                 )
             else:
-                final_child["attrs"].append(child["attr"])
+                if final_child["properties"].get("init") is not None:
+                    final_child["properties"]["init"].append(child["attr"])
+                else:
+                    final_child["attrs"].append(child["attr"])
         else:
-            if isinstance(child, dict):
+
+            if "attr" in child:
+                final_child = process_no_name_attrs(final_child, child)
+            elif isinstance(child, dict):
                 final_child.update(child)
             elif isinstance(child, list):
                 for i in child:
@@ -162,8 +176,43 @@ def visit_expr(self, node, visited_children):
                 n += 1
             if "attr" in final_child:
                 del final_child["attr"]
+            if final_child["properties"].get("init") == []:
+                del final_child["properties"]["init"]
+            elif final_child["properties"].get("init"):
+                if not children_values[n]["properties"].get("init"):
+                    children_values[n]["properties"]["init"] = final_child[
+                        "properties"
+                    ]["init"]
+
         return children_values
 
+    def visit_type(self, node, visited_children):
+        _index = node.text.find(":")
+        _type = node.text[_index + 1 :]  # noqa: E203
+        return {"type": _type.strip()}
+
     def generic_visit(self, node, visited_children):
         """The generic visit method."""
         return visited_children or node
+
+
+def process_no_name_attrs(final_child: Dict, child: Dict) -> Dict:
+    if child["attr"]["default"]:
+        final_child["attrs"][-1]["default"] = child["attr"]["default"]
+        if not final_child["attrs"][-1].get("properties"):
+            final_child["attrs"][-1]["properties"] = {}
+    elif child["attr"]["type"]:
+        final_child["attrs"][-1]["default"] += f':{child["attr"]["type"]}'
+    return final_child
+
+
+def get_django_info(text: list, base_text: str, properties: Dict) -> Tuple:
+    #    for tortoise orm & django orm
+    split_by_field = base_text.split("Field")[0].split(".")
+    if len(split_by_field) == 2:
+        _type = split_by_field[1]
+    else:
+        _type = split_by_field[0]
+    if _type == "ManyToMany":
+        properties["foreign_key"] = text[0]
+    return _type, properties
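Review note: the field type in `get_django_info` is derived purely by string splitting on "Field" and ".". The sketch below is a standalone copy of that logic for illustration only. The function name `django_field_type` and the sample inputs are assumptions; in particular, the exact `text` list that `extract_orm_attr` passes in is not visible in this hunk.

```python
from typing import Dict, List, Tuple


def django_field_type(
    text: List[str], base_text: str, properties: Dict
) -> Tuple[str, Dict]:
    # Hypothetical standalone copy of the splitting logic in get_django_info.
    # Everything before the first "Field" is kept, then split on ".".
    prefix = base_text.split("Field")[0].split(".")
    # "models.Char" -> ["models", "Char"]; a bare "Char" -> ["Char"]
    _type = prefix[1] if len(prefix) == 2 else prefix[0]
    if _type == "ManyToMany":
        # the first argument is treated as the related model
        properties["foreign_key"] = text[0]
    return _type, properties


# Sample inputs are assumed, not taken from the test suite.
char_type, char_props = django_field_type(
    ["max_length=30"], "models.CharField(max_length=30)", {}
)
m2m_type, m2m_props = django_field_type(
    ["Topping"], "models.ManyToManyField(Topping)", {}
)
bare_type, _ = django_field_type(["pk=True"], "IntField(pk=True)", {})

print(char_type, char_props)
print(m2m_type, m2m_props)
print(bare_type)
```

Because the split relies on exactly one dot before "Field", a deeper path such as `django.db.models.CharField` would yield three parts and fall into the `else` branch, returning the first segment rather than the field name.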
diff --git a/tests/test_django.py b/tests/test_django.py
diff --git a/tests/test_python_class.py b/tests/test_python_class.py
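Review note: the two test files are listed here without content. As a rough illustration of the annotation handling that `visit_type` adds in visitor.py, the sketch below applies the same slicing to a plain string instead of a parsimonious node. The helper name `type_from_annotation` and the sample strings are assumptions, not part of the library.

```python
from typing import Dict


def type_from_annotation(node_text: str) -> Dict[str, str]:
    # Mirrors visit_type: keep everything after the first ":" and strip it.
    # node_text stands in for node.text of a matched "type" rule.
    _index = node_text.find(":")
    _type = node_text[_index + 1:]
    return {"type": _type.strip()}


# Sample annotation fragments (assumed shapes of what the "type" rule matches).
simple = type_from_annotation(": int")
generic = type_from_annotation(": List[str]")

print(simple)
print(generic)
```

In `visit_attr_def`, a dict of this shape is picked up by the new `elif isinstance(children[-1], dict) and "type" in children[-1]` branch and written to `attr["attr"]["type"]`.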