COA Data Extraction API
The api/data/coas API endpoint extracts lab results from certificates of analysis (COAs) supplied as PDFs, as images that contain a QR code of the COA URL, or as COA URLs directly. The endpoint also lets users get, query, and delete parsed lab results. COA data can be passed in the request body in the following formats:
- PDFs: Users can upload COA PDF files directly as part of the request using the file field. The API supports a maximum of 100 files per request, and each file should not exceed 5 MB in size.
- Images: Users can upload images that contain QR codes of the COA URLs using the file field. The API accepts PNG, JPG, and JPEG image formats.
- URLs: Users can provide COA URLs in the request body using the urls field as an array.
Example
Post COA URLs and files to have the data extracted and returned.
POST /api/data/coas
Content-Type: multipart/form-data
Authorization: Bearer <token>
{
"urls": ["https://coa-url-1", "https://coa-url-2"],
"file": [PDF_FILE_1, PDF_FILE_2, IMAGE_FILE_1, IMAGE_FILE_2]
}
Upon successful extraction, the API will return a JSON response with the extracted COA data.
HTTP/1.1 200 OK
Content-Type: application/json
{
"success": true,
"data": [
{
"coa_pdf": "Pineapple-XX-5-13-2129146.pdf",
"coa_hash": "9501acee692a29f309b618ac994179d274753de3c4131837af2d8e552920ec95",
"analyses": "[\"terpenes\", \"pesticide\", \"microbiological\", \"water_activity\", \"mycotoxin\", \"residual_solvents\", \"potency\", \"heavy_metals\"]",
"potency_status": "pass",
"terpenes_status": "pass",
"microbiological_status": "pass",
"mycotoxin_status": "NT",
"residual_solvents_status": "NT",
"heavy_metals_status": "pass",
"pesticide_status": "pass",
"water_activity_status": "pass",
"methods": null,
"date_collected": null,
"date_tested": null,
"date_received": "2021-05-13T00:00:00",
"lab": "Genesis Testing Labs",
"lab_address": "1620 South Main St, Unit A, Grove, OK 74344",
"lab_street": "1620 South Main St",
"lab_city": "Grove",
"lab_state": "OK",
"lab_zipcode": "74344",
"distributor": null,
"distributor_address": null,
"distributor_street": null,
"distributor_city": null,
"distributor_state": null,
"distributor_zipcode": null,
"distributor_license_number": null,
"producer": "On The Hill",
"producer_address": "7703 W 7st Street, Tulsa, OK 74127",
"producer_street": "7703 W 7st Street",
"producer_city": "Tulsa",
"producer_state": "OK",
"producer_zipcode": "74127",
"producer_license_number": "GAAA-4JCT-WABI",
"product_name": "Pineapple XX",
"lab_id": "SA-051321-8070",
"product_type": "flower",
"batch_number": null,
"traceability_ids": null,
"product_size": null,
"serving_size": null,
"servings_per_package": null,
"sample_weight": null,
"status": "pass",
"total_cannabinoids": null,
"total_thc": 18.2281,
"total_cbd": null,
"total_terpenes": 1.7241,
"sample_id": "ae39227764932d53abbfe37ceb0e5f88c84ef9fce8e545a0135cdeedd3e41e04",
"strain_name": "Pineapple XX",
"coa_algorithm": "coa_ai.py",
"coa_algorithm_entry_point": "parse_coa_with_ai",
"coa_algorithm_version": "0.0.15",
"coa_parsed_at": "2023-06-12T18:22:33.816071",
"images": "[]",
"results": "[]",
"results_hash": "0dac0a24ba0545e6812b172c25d78eecd449f6d2e3463357c8051c693dcfe1f2",
"sample_hash": "559540bbcf278a11efdacebde76fc9305eaf54b22199b0b3648f2837a009ae0b",
"warning": "This data was extracted by AI. Please verify it before using it. You can submit feedback to dev@cannlytics.com"
}
]
}
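In the sample response, the analyses, images, and results fields arrive as JSON-encoded strings rather than nested arrays. The following is a minimal sketch, assuming the response shape shown above, of how a client might check the success flag and decode those fields. The helper name decode_coa_response and the JSON_STRING_FIELDS tuple are hypothetical and not part of the API.

```python
import json

# Fields that the sample response shows as JSON-encoded strings.
# This tuple is an assumption based on the example response.
JSON_STRING_FIELDS = ("analyses", "images", "results")


def decode_coa_response(payload):
    """Return the COA records with JSON-string fields decoded.

    `payload` is the parsed JSON body of the response (a dict).
    """
    if not payload.get("success"):
        raise ValueError("COA extraction was not successful.")
    records = []
    for record in payload.get("data", []):
        decoded = dict(record)  # Copy so the input is left untouched.
        for field in JSON_STRING_FIELDS:
            value = decoded.get(field)
            if isinstance(value, str):
                decoded[field] = json.loads(value)
        records.append(decoded)
    return records


# A trimmed-down payload in the same shape as the sample response.
sample = {
    "success": True,
    "data": [{
        "product_name": "Pineapple XX",
        "analyses": "[\"terpenes\", \"potency\"]",
        "images": "[]",
        "results": "[]",
        "methods": None,
    }],
}
coas = decode_coa_response(sample)
print(coas[0]["analyses"])
```

Fields that are null or already decoded pass through unchanged, so the helper can be applied to every record without first checking which fields are present.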
COA Metadata
Field | Example | Description |
---|---|---|
analyses | ["cannabinoids"] | A list of analyses performed on a given sample. |
{analysis}_status | "pass" | The pass, fail, or N/A status for pass / fail analyses. |
methods | [{"analysis": "cannabinoids", "method": "HPLC"}] | The methods used for each analysis. |
date_collected | 2022-04-20T04:20 | An ISO-formatted time when the sample was collected. |
date_tested | 2022-04-20T16:20 | An ISO-formatted time when the sample was tested. |
date_received | 2022-04-20T12:20 | An ISO-formatted time when the sample was received. |
lab | "MCR Labs" | The lab that tested the sample. |
lab_address | "85 Speen St, Framingham, MA 01701" | The lab's address. |
lab_street | "85 Speen St" | The lab's street. |
lab_city | "Framingham" | The lab's city. |
lab_state | "MA" | The lab's state. |
lab_zipcode | "01701" | The lab's zip code. |
distributor | "Fred's Dispensary" | The name of the product distributor, if applicable. |
distributor_address | "420 State Ave, Olympia, WA 98506" | The distributor address, if applicable. |
distributor_street | "420 State Ave" | The distributor street, if applicable. |
distributor_city | "Olympia" | The distributor city, if applicable. |
distributor_state | "WA" | The distributor state, if applicable. |
distributor_zipcode | "98506" | The distributor zip code, if applicable. |
distributor_license_number | "L-123" | The distributor license number, if applicable. |
producer | "Grow House" | The producer of the sampled product. |
producer_address | "3rd & Army, San Francisco, CA 55555" | The producer's address. |
producer_street | "3rd & Army" | The producer's street. |
producer_city | "San Francisco" | The producer's city. |
producer_state | "CA" | The producer's state. |
producer_zipcode | "55555" | The producer's zip code. |
producer_license_number | "L2Calc" | The producer's license number. |
product_name | "Blue Rhino Pre-Roll" | The name of the product. |
lab_id | "Sample-0001" | A lab-specific ID for the sample. |
product_type | "flower" | The type of product. |
batch_number | "Order-0001" | A batch number for the sample or product. |
traceability_ids | ["1A4060300002199000003445"] | A list of relevant traceability IDs. |
product_size | 2000 | The size of the product in milligrams. |
serving_size | 1000 | An estimated serving size in milligrams. |
servings_per_package | 2 | The number of servings per package. |
sample_weight | 1 | The weight of the product sample in grams. |
status | "pass" | The overall pass / fail status for all contaminant screening analyses. |
total_cannabinoids | 14.20 | The analytical total of all cannabinoids measured. |
total_thc | 14.00 | The analytical total of THC and THCA. |
total_cbd | 0.20 | The analytical total of CBD and CBDA. |
total_terpenes | 0.42 | The sum of all terpenes measured. |
sample_id | "{sha256-hash}" | A generated ID to uniquely identify the producer, product_name, and date_tested. |
strain_name | "Blue Rhino" | A strain name, if specified. Otherwise, it may be parsed from the product_name. |
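Because each analysis reports its outcome in a separate {analysis}_status field, a client often needs to gather those fields into one view. The following is a minimal sketch of one way to do that, assuming a decoded COA record in the shape documented above. The function names are hypothetical, and treating the literal string "fail" as the failing value is an assumption drawn from the field description.

```python
def collect_statuses(coa):
    """Map each analysis name to its `{analysis}_status` value.

    The overall `status` field is skipped because it has no analysis prefix.
    """
    suffix = "_status"
    return {
        key[: -len(suffix)]: value
        for key, value in coa.items()
        if key.endswith(suffix) and value is not None
    }


def failed_analyses(coa):
    """List analyses whose status is "fail" (case-insensitive), sorted by name."""
    return sorted(
        name
        for name, status in collect_statuses(coa).items()
        if str(status).lower() == "fail"
    )


# An illustrative record; the values are made up for this example.
coa = {
    "status": "fail",
    "potency_status": "pass",
    "mycotoxin_status": "NT",
    "pesticide_status": "fail",
    "distributor": None,
}
statuses = collect_statuses(coa)
print(statuses)
print(failed_analyses(coa))
```

Statuses other than "pass" and "fail", such as the "NT" values in the sample response, are kept as-is so the caller can decide how to treat analyses that were not performed.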
COA Results
The results field is a JSON-encoded string that represents a list of result objects, for example:
[
{
"analysis": "cannabinoids",
"key": "thca",
"name": "THC-A",
"value": 14.20,
"mg_g": 142,
"units": "percent",
"limit": null,
"lod": null,
"loq": null,
"status": null
}
]
Field | Example | Description |
---|---|---|
analysis | "pesticides" | The analysis used to obtain the result. |
key | "pyrethrins" | A standardized key for the result analyte. |
name | "Pyrethrins" | The lab's internal name for the result analyte. |
value | 0.42 | The value of the result. |
mg_g | 0.00042 | The value of the result in milligrams per gram. |
units | "ug/g" | The units for the result value, limit, lod, and loq. |
limit | 0.5 | A pass / fail threshold for contaminant screening analyses. |
lod | 0.01 | The limit of detection for the result analyte. Values below the lod are typically reported as ND. |
loq | 0.1 | The limit of quantification for the result analyte. Values above the lod but below the loq are typically reported as <LOQ. |
status | "pass" | The pass / fail status for contaminant screening analyses. |
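The relationship between value, units, and mg_g, and the ND and <LOQ reporting conventions, can be sketched as follows. This is an illustration only, not the parser's actual implementation: the function names are hypothetical, only a few units are handled, and the treatment of values exactly equal to a limit is an assumption.

```python
def to_mg_g(value, units):
    """Convert a result value to milligrams per gram.

    Handles a few units chosen for illustration; returns None otherwise.
    """
    factors = {
        "percent": 10.0,  # 1% by weight is 10 mg per gram.
        "mg/g": 1.0,
        "ug/g": 0.001,    # 1 microgram is 0.001 milligrams.
    }
    factor = factors.get(units)
    if value is None or factor is None:
        return None
    return value * factor


def report_value(value, lod=None, loq=None):
    """Label a measurement relative to its detection limits.

    Returns "ND" below the lod, "<LOQ" below the loq, else the value itself.
    """
    if value is None:
        return None
    if lod is not None and value < lod:
        return "ND"
    if loq is not None and value < loq:
        return "<LOQ"
    return value


print(to_mg_g(14.20, "percent"))
print(to_mg_g(0.42, "ug/g"))
print(report_value(0.05, lod=0.01, loq=0.1))
```

Returning None for unknown units, rather than guessing a conversion factor, keeps unconverted results distinguishable from genuine zero values.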
Limitations
- Maximum number of files that can be parsed in one request: 10
- Maximum file size for a single file: 100 MB
- Supported file types: PDF, PNG, JPG, JPEG
- Maximum number of observations that can be downloaded at once: 200,000
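A client can check a batch against these limits before uploading, which avoids sending a request that would be rejected. The following is a minimal sketch under the limits listed above. The function name check_uploads is hypothetical, 100 MB is interpreted as 100 * 1024 * 1024 bytes (an assumption), and the limits are parameters so they can be adjusted if the service enforces different values.

```python
import os

# Defaults taken from the limitations list; adjust if the service differs.
MAX_FILES = 10
MAX_FILE_BYTES = 100 * 1024 * 1024  # 100 MB, read here as mebibytes (assumption).
ALLOWED_EXTENSIONS = {".pdf", ".png", ".jpg", ".jpeg"}


def check_uploads(files, max_files=MAX_FILES, max_bytes=MAX_FILE_BYTES):
    """Return a list of problems found in `files`.

    `files` is a list of (filename, size_in_bytes) pairs. An empty list
    means the batch fits within the limits passed in.
    """
    problems = []
    if len(files) > max_files:
        problems.append(f"Too many files: {len(files)} > {max_files}")
    for name, size in files:
        extension = os.path.splitext(name)[1].lower()
        if extension not in ALLOWED_EXTENSIONS:
            problems.append(f"Unsupported file type: {name}")
        if size > max_bytes:
            problems.append(f"File too large: {name}")
    return problems


# Illustrative batch; in practice sizes could come from os.path.getsize.
batch = [("coa.pdf", 250_000), ("label.PNG", 80_000), ("notes.docx", 10_000)]
problems = check_uploads(batch)
print(problems)
```

The check compares extensions case-insensitively, so a file named label.PNG is accepted. It does not inspect file contents, so a mislabeled file would still pass.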
Examples
import requests
# Define the API URL.
api_url = "https://cannlytics.com/api/data/coas"
# Define a COA URL to parse.
coa_url = "https://cannlytics.page.link/test-coa"
headers = {"Authorization": "Bearer <token>"}
data = {"urls": [coa_url]}
# Parse a COA URL with the API.
response = requests.post(api_url, headers=headers, json=data)
extracted = response.json()
print(extracted["data"])
# Parse a COA PDF with the API.
doc = 'coa.pdf'
with open(doc, 'rb') as pdf:
    files = {'file': pdf}
    response = requests.post(api_url, files=files, headers=headers)
extracted = response.json()
print(extracted["data"])