1 | | -# mindee-python |
2 | | - |
3 | | - |
4 | | -The documentation for the Mindee API can be found [here]. |
5 | | - |
6 | | -The Python library documentation can be found [here]. |
7 | | - |
8 | | -# Contents |
9 | | - |
10 | | -1. [Installation](#installation) |
11 | | -2. [Getting started](#getting-started) |
12 | | - * [API Credentials](#api-credentials) |
13 | | - * [Client response structure](#client-response-structure) |
14 | | -3. [Parse a receipt](#parse-a-receipt) |
15 | | - * [Receipt objects data](#receipt-objects-data) |
16 | | - * [Receipt objects methods](#receipt-objects-methods) |
17 | | -4. [Parse an invoice](#parse-an-invoice) |
18 | | - * [Invoice objects data](#receipt-objects-data) |
19 | | - * [Invoice objects methods](#receipt-objects-methods) |
20 | | -5. [Parse invoice and receipt in a single endpoint](#receipts-and-invoices-with-a-single-endpoint) |
21 | | - * [Financial Document objects data](#receipt-objects-data) |
22 | | - * [Financial Document objects methods](#receipt-objects-methods) |
23 | | -6. [Parse a passport](#parse-a-passport) |
24 | | - * [Passport objects data](#receipt-objects-data) |
25 | | - * [Passport objects methods](#receipt-objects-methods) |
26 | | - |
27 | | -## Installation |
28 | | - |
29 | | -Install from PyPI using [pip](https://pip.pypa.io/en/latest/), a |
30 | | -package manager for Python. |
31 | | - |
32 | | - pip install mindee |
33 | | - |
34 | | -If `pip install` fails on Windows, check the path length of the directory. If it is longer than 260 characters, enable [Long Paths](https://docs.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation) or choose a shorter location. |
35 | | - |
36 | | -Don't have pip installed? Try installing it by running this from the command |
37 | | -line: |
38 | | - |
39 | | - $ curl https://bootstrap.pypa.io/get-pip.py | python |
40 | | - |
41 | | -You may need to run the above commands with `sudo`. |
42 | | - |
43 | | -## Getting Started |
44 | | - |
45 | | -Getting started with the Mindee API couldn't be easier. Create a |
46 | | -`Client` and you're ready to go. |
47 | | - |
48 | | -### API Credentials |
49 | | - |
50 | | -The Mindee `Client` needs your API credentials. You can pass these either |
51 | | -directly to the constructor (see the code below) or via environment variables. |
52 | | - |
53 | | -Depending on what type of document you want to parse, you need to provide |
54 | | -a specific auth token for each endpoint. |
55 | | -```python |
56 | | -from mindee import Client |
57 | | - |
58 | | -mindee_client = Client( |
59 | | - expense_receipt_token="your_expense_receipts_api_token_here", |
60 | | - invoice_token="your_invoices_api_token_here", |
61 | | - passport_token="your_passport_api_token_here" |
62 | | -) |
63 | | -``` |
64 | | - |
65 | | -We suggest storing your credentials as environment variables. Why? You'll never |
66 | | -have to worry about committing your credentials and accidentally posting them |
67 | | -somewhere public. |
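
As a minimal sketch of the environment-variable approach, the helper below collects whichever tokens are set and maps them to the `Client` keyword arguments shown above. The variable names (`MINDEE_EXPENSE_RECEIPT_TOKEN` and so on) and the `tokens_from_env` helper are assumptions made for illustration, not names defined by the library.

```python
import os

# Hypothetical environment variable names mapped to the Client keyword
# arguments used in this README. Adjust them to your own deployment.
ENV_TO_KWARG = {
    "MINDEE_EXPENSE_RECEIPT_TOKEN": "expense_receipt_token",
    "MINDEE_INVOICE_TOKEN": "invoice_token",
    "MINDEE_PASSPORT_TOKEN": "passport_token",
}


def tokens_from_env(environ=None):
    """Collect the tokens that are set, keyed by Client keyword argument."""
    environ = os.environ if environ is None else environ
    return {
        kwarg: environ[name]
        for name, kwarg in ENV_TO_KWARG.items()
        if environ.get(name)  # skip unset or empty variables
    }


# Usage (requires the mindee package to be installed):
# from mindee import Client
# mindee_client = Client(**tokens_from_env())
```

Passing only the tokens that are present keeps the constructor call the same whether you use one endpoint or all of them.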
68 | | - |
69 | | -### Client response structure |
70 | | - |
71 | | -The client object provides a parsing method specific to each type of |
72 | | -document supported by the Mindee API. |
73 | | - |
74 | | -Examples: |
75 | | - |
76 | | -```python |
77 | | -from mindee import Client |
78 | | - |
79 | | -mindee_client = Client( |
80 | | - expense_receipt_token="your_expense_receipts_api_token_here", |
81 | | - invoice_token="your_invoices_api_token_here", |
82 | | - passport_token="your_passport_api_token_here" |
83 | | -) |
84 | | - |
85 | | -# This is a dummy example, see other methods below for real examples |
86 | | -parsed_data = mindee_client.parse_document_xxx("/path/to/file") |
87 | | -``` |
88 | | - |
89 | | -Each object returned by parsing methods follows the same structure: |
90 | | - |
91 | | -#### parsed_data.document |
92 | | -This attribute is the Document object constructed by gathering all the pages into a |
93 | | -single document. If you need one object per page for multi-page PDFs, see `parsed_data.pages`. |
94 | | -```python |
95 | | -parsed_data.document # a single object of class DocumentXXX |
96 | | -``` |
97 | | - |
98 | | - |
99 | | -#### parsed_data.pages |
100 | | -For multi-page PDFs, the `pages` attribute is a list of document objects, each |
101 | | -constructed from a single page of the PDF. |
102 | | -```python |
103 | | -parsed_data.pages # [DocumentXXX, DocumentXXX ...] |
104 | | -``` |
105 | | - |
106 | | - |
107 | | -#### parsed_data.http_response |
108 | | -Contains the full Mindee API response object. |
109 | | -```python |
110 | | -parsed_data.http_response # full HTTP response object |
111 | | -``` |
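
To illustrate the choice between the merged document and the per-page objects, here is a small sketch. The `documents_to_inspect` helper is hypothetical, and the `SimpleNamespace` object only stands in for a real parsing result so the snippet runs without calling the API.

```python
from types import SimpleNamespace


def documents_to_inspect(parsed_data, per_page=False):
    """Return the per-page objects, or the merged document in a one-item list."""
    if per_page:
        return list(parsed_data.pages)
    return [parsed_data.document]


# Stand-in for the object returned by a parsing method (assumption:
# it exposes .document and .pages as described above).
fake_parsed_data = SimpleNamespace(
    document="merged document",
    pages=["page 1 document", "page 2 document"],
)

for doc in documents_to_inspect(fake_parsed_data, per_page=True):
    print(doc)
```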
112 | | - |
113 | | -## Parse a receipt |
114 | | - |
115 | | -```python |
116 | | -from mindee import Client |
117 | | - |
118 | | -mindee_client = Client( |
119 | | - expense_receipt_token="your_expense_receipts_api_token_here" |
120 | | -) |
121 | | - |
122 | | -receipt_data = mindee_client.parse_receipt('./path/to/receipt.jpg') |
123 | | -print(receipt_data.document) |
124 | | -``` |
125 | | - |
126 | | -#### Receipt objects data |
127 | | -Here are the extracted fields, with examples of how to read them from a Receipt object: |
128 | | -* locale |
129 | | -```python |
130 | | -receipt_data.document.locale.value # en-US (string) |
131 | | -receipt_data.document.locale.language # en (string) |
132 | | -receipt_data.document.locale.country # US (string) |
133 | | -receipt_data.document.locale.probability # 0.89 (float) |
134 | | -``` |
135 | | -* total_incl |
136 | | -```python |
137 | | -receipt_data.document.total_incl.value # 144.97 (float) |
138 | | -receipt_data.document.total_incl.probability # 0.89 (float) |
139 | | -``` |
140 | | -* date |
141 | | -```python |
142 | | -receipt_data.document.date.value # 2020-12-04 (string) |
143 | | -receipt_data.document.date.date_object # Object (datetime.date object) |
144 | | -receipt_data.document.date.probability # 0.99 (float) |
145 | | -``` |
146 | | -* merchant_name |
147 | | -```python |
148 | | -receipt_data.document.merchant_name.value # Amazon (string) |
149 | | -receipt_data.document.merchant_name.probability # 0.97 (float) |
150 | | -``` |
151 | | -* time |
152 | | -```python |
153 | | -receipt_data.document.time.value # 15:02 (string) |
154 | | -receipt_data.document.time.probability # 0.44 (float) |
155 | | -``` |
156 | | -* orientation |
157 | | -```python |
158 | | -receipt_data.document.orientation.value # 90 (int) |
159 | | -receipt_data.document.orientation.probability # 0.97 (float) |
160 | | -``` |
161 | | -* total_tax |
162 | | -```python |
163 | | -receipt_data.document.total_tax.value # 12.48 (float) |
164 | | -receipt_data.document.total_tax.probability # 0.97 (float) |
165 | | -``` |
166 | | -* taxes |
167 | | -```python |
168 | | -receipt_data.document.taxes # List of Tax objects |
169 | | - |
170 | | -receipt_data.document.taxes[0].value # 2.41 (float) |
171 | | -receipt_data.document.taxes[0].probability # 0.45 (float) |
172 | | -receipt_data.document.taxes[0].rate # 0.2 (float) |
173 | | -``` |
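
Since every field above carries a `probability`, one possible use is to flag low-confidence fields for manual review. The sketch below assumes only the `value` and `probability` attributes shown in the examples. The `low_confidence_fields` helper and the 0.5 threshold are illustrative choices, and the `SimpleNamespace` objects stand in for a real `receipt_data.document`.

```python
from types import SimpleNamespace


def low_confidence_fields(document, field_names, threshold=0.5):
    """Return {field name: probability} for fields below the threshold."""
    flagged = {}
    for name in field_names:
        field = getattr(document, name, None)
        if field is None:
            continue  # field not present on this document
        if field.probability < threshold:
            flagged[name] = field.probability
    return flagged


# Stand-in for receipt_data.document, reusing the example values above.
fake_document = SimpleNamespace(
    total_incl=SimpleNamespace(value=144.97, probability=0.89),
    merchant_name=SimpleNamespace(value="Amazon", probability=0.97),
    time=SimpleNamespace(value="15:02", probability=0.44),
)

flagged = low_confidence_fields(
    fake_document, ["total_incl", "merchant_name", "time"]
)
print(flagged)  # {'time': 0.44}
```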
174 | | - |
175 | | -#### Receipt objects methods |
176 | | - |
177 | | -## Parse a passport |
178 | | - |
179 | | -```python |
180 | | -from mindee import Client |
181 | | - |
182 | | -mindee_client = Client( |
183 | | - passport_token="your_passport_api_token_here" |
184 | | -) |
185 | | - |
186 | | -passport_data = mindee_client.parse_passport('./path/to/passport.jpeg') |
187 | | -print(passport_data.document) |
188 | | - |
189 | | -``` |
190 | | - |
191 | | - |
192 | | -## Parse an invoice |
193 | | - |
194 | | -```python |
195 | | -from mindee import Client |
196 | | - |
197 | | -mindee_client = Client( |
198 | | - invoice_token="your_invoices_api_token_here" |
199 | | -) |
200 | | - |
201 | | -invoice_data = mindee_client.parse_invoice("./path/to/invoice.pdf") |
202 | | -print(invoice_data.document) |
203 | | -``` |
204 | | - |
205 | | - |
206 | | -## Receipts and invoices with a single endpoint |
207 | | - |
208 | | -```python |
209 | | -from mindee import Client |
210 | | - |
211 | | -mindee_client = Client( |
212 | | - expense_receipt_token="your_expense_receipts_api_token_here", |
213 | | - invoice_token="your_invoices_api_token_here" |
214 | | -) |
215 | | - |
216 | | -financial_document = mindee_client.parse_financial_document("./path/to/invoice.pdf/or/receipt.jpg") |
217 | | -print(financial_document.document) |
218 | | -``` |
| 1 | +# mindee-python |