PDF Conversion API

PDF Conversion API extracts text or HTML from a PDF document supplied either as a URL or as an uploaded file.

Developer Portal : https://api.market/store/magicapi/pdf-extract

About

The API is aimed at developers and businesses that need to extract information from PDF documents. With a few lines of code, you can convert a PDF file into plain text or HTML.

To use the API, call one of its four endpoints from your application or workflow. The endpoint to call depends on the output you want and on whether the PDF is hosted at a URL or uploaded as a file. To extract text from a PDF hosted online, use the '/pdf-to-text-url/' endpoint. To extract text from a PDF file stored locally, use the '/pdf-to-text-file/' endpoint. To generate HTML instead, use the '/pdf-to-html-url/' and '/pdf-to-html-file/' endpoints.
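The choice of endpoint can be sketched in Python as a small lookup. The base URL is copied from the curl examples in this document. The helper name endpoint_url and its argument names are hypothetical and only serve the illustration.

```python
# Base URL as it appears in the curl examples in this document.
BASE_URL = "https://api.magicapi.dev/api/v1/magicapi/pdf-extract"


def endpoint_url(output: str, source: str) -> str:
    """Return the endpoint URL for an output format and an input source.

    output: "text" or "html"
    source: "url" (PDF hosted online) or "file" (PDF uploaded from disk)
    Hypothetical helper, not part of the API itself.
    """
    if output not in ("text", "html"):
        raise ValueError(f"unsupported output format: {output!r}")
    if source not in ("url", "file"):
        raise ValueError(f"unsupported source: {source!r}")
    # The four documented endpoints all follow the pattern /pdf-to-<output>-<source>/
    return f"{BASE_URL}/pdf-to-{output}-{source}/"
```

For example, endpoint_url("html", "file") yields the address of the '/pdf-to-html-file/' endpoint.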

Why use the PDF Conversion API? Automating the conversion replaces manual extraction work and saves time and resources. Typical uses include building content analysis tools, enhancing search functionality, and creating dynamic web pages from PDF documents.

When you convert a PDF to text, the API returns the plain text content in a JSON object, which is easy to parse, analyse, or integrate into your applications. Converting to HTML preserves the document's formatting, structure, and styling, so the result can be published on the web or repurposed. In the sample responses below, both outputs arrive wrapped in a JSON object, under a "text" key or an "html" key respectively.
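Reading the converted content out of a response could look like the following Python sketch. It assumes the response body is a JSON object with either a "text" or an "html" key, as in the sample responses in this document. The function name extract_content is hypothetical.

```python
import json


def extract_content(response_body: str) -> str:
    """Return the converted content from a JSON response body.

    Assumes the body is a JSON object carrying a "text" or an "html" key,
    as in the sample responses shown in this document.
    """
    payload = json.loads(response_body)
    for key in ("text", "html"):
        if key in payload:
            return payload[key]
    raise KeyError("response contains neither 'text' nor 'html'")


# Sample body taken from the /pdf-to-text-file/ response in this document.
sample = '{"text": "This is a test PDF file "}'
print(extract_content(sample).strip())
```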

Curl Request and Response :

For the /pdf-to-html-url/ endpoint, the request and response would be :

Request :

curl -X 'POST' \
  'https://api.magicapi.dev/api/v1/magicapi/pdf-extract/pdf-to-html-url/' \
  -H 'accept: application/json' \
  -H 'x-magicapi-key: API-KEY' \
  -H 'Content-Type: application/json' \
  -d '{
  "pdf_url": "https://www.cedarville.edu/-/media/Files/PDF/Web-Development-Services/SamplePDF.pdf?la=en&hash=1B9D390C8225C1DDE2155F786C6515A3CEF9D4EC"
}'

Response :

{
  "html": "<html><head>\n<meta http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\">\n</head><body>\n<span style=\"position:absolute; border: gray 1px solid; left:0px; top:50px; width:612px; height:792px;\"></span>\n<div style=\"position:absolute; top:50px;\"><a name=\"1\">Page 1</a></div>\n<span style=\"position:absolute; border: black 1px solid; left:0px; top:50px; width:612px; height:792px;\"></span>\n<span style=\"position:absolute; border: black 1px solid; left:0px; top:50px; width:612px; height:792px;\"></span>\n<span style=\"position:absolute; border: black 1px solid; left:0px; top:50px; width:612px; height:792px;\"></span>\n<span style=\"position:absolute; border: black 1px solid; left:0px; top:122px; width:612px; height:720px;\"></span>\n<span style=\"font-family: ArialMT; font-size:11px\">Sample PDF  This is a sample PDF file that I will use to test the Sitecore publishing. "
}
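The same URL-based request could be issued from Python with only the standard library, as in the sketch below. The header names and the "pdf_url" field are taken from the curl example. The function names build_url_request and send are hypothetical, and setting "accept" to application/json is an assumption based on the JSON sample responses.

```python
import json
import urllib.request


def build_url_request(endpoint: str, pdf_url: str, api_key: str) -> urllib.request.Request:
    """Build a POST request for one of the URL-based endpoints.

    Mirrors the curl example: JSON body with a "pdf_url" field and the
    API key in the x-magicapi-key header.
    """
    body = json.dumps({"pdf_url": pdf_url}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        method="POST",
        headers={
            # Assumption: the sample responses are JSON, so JSON is requested.
            "accept": "application/json",
            "x-magicapi-key": api_key,
            "Content-Type": "application/json",
        },
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response (performs a network call)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Building the request separately from sending it keeps the construction step easy to inspect and test without a valid API key.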

For the /pdf-to-text-url/ endpoint, the request and response would be :

Request :

curl -X 'POST' \
  'https://api.magicapi.dev/api/v1/magicapi/pdf-extract/pdf-to-text-url/' \
  -H 'accept: application/json' \
  -H 'x-magicapi-key: API-KEY' \
  -H 'Content-Type: application/json' \
  -d '{
  "pdf_url": "https://www.cedarville.edu/-/media/Files/PDF/Web-Development-Services/SamplePDF.pdf?la=en&hash=1B9D390C8225C1DDE2155F786C6515A3CEF9D4EC"
}'

Response :

{
  "text": "Sample PDF\n \n \nThis is a sample PDF file that I will use to test the Sitecore publishing.\n "
}

For the /pdf-to-text-file/ endpoint, the request and response would be :

Request :

curl -X 'POST' \
  'https://api.magicapi.dev/api/v1/magicapi/pdf-extract/pdf-to-text-file/' \
  -H 'accept: application/json' \
  -H 'x-magicapi-key: API-KEY' \
  -H 'Content-Type: multipart/form-data' \
  -F 'pdf_file=@TestPDFfile.pdf;type=application/pdf'

Response :

{
    "text": "This is a test PDF file "
}
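A file upload like the one above could be built in Python without third-party packages, as sketched here. The form field name "pdf_file" and the headers come from the curl example. The function name build_file_request is hypothetical, and the multipart body is assembled by hand, so this is a minimal illustration that does not escape unusual filenames.

```python
import urllib.request
import uuid


def build_file_request(endpoint: str, filename: str, pdf_bytes: bytes,
                       api_key: str) -> urllib.request.Request:
    """Build a multipart/form-data POST request for a file-based endpoint."""
    # Random boundary string separating the parts of the multipart body.
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="pdf_file"; filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n"
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=head + pdf_bytes + tail,
        method="POST",
        headers={
            "accept": "application/json",
            "x-magicapi-key": api_key,
            # The boundary must be announced in the Content-Type header.
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

The resulting request can be passed to urllib.request.urlopen, and the JSON response decoded in the same way as for the URL-based endpoints.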

For the /pdf-to-html-file/ endpoint, the request and response would be :

Request :

curl -X 'POST' \
  'https://api.magicapi.dev/api/v1/magicapi/pdf-extract/pdf-to-html-file/' \
  -H 'accept: application/json' \
  -H 'x-magicapi-key: API-KEY' \
  -H 'Content-Type: multipart/form-data' \
  -F 'pdf_file=@TestPDFfile.pdf;type=application/pdf'

Response :

{
    "html": "<html><head>\n<meta http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\">\n</head><body>\n<span style=\"position:absolute; border: gray 1px solid; left:0px; top:50px; width:612px; height:792px;\"></span>\n<div style=\"position:absolute; top:50px;\"><a name=\"1\">Page 1</a></div>\n<span style=\"font-family: Calibri; font-size:11px\">This is a test PDF file "
}
