In software projects, there is often a requirement to convert a given file (HTML, TXT, etc.) to a PDF file and, conversely, to convert a PDF file to HTML, TXT, or other formats. Sometimes PDFs also need to be stored as images of type PNG, GIF, etc. Let us see how to do this through a sample Maven project.

As it is a Maven project, the necessary dependencies need to be added to pom.xml. The essential library is Pdf2Dom:

    <!-- To load the selected PDF file -->
    <dependency>
        <groupId>org.apache.pdfbox</groupId>
        <artifactId>pdfbox-tools</artifactId>
        <version>2.0.25</version>
    </dependency>
    <!-- Required for conversion -->
    <dependency>
        <groupId>net.sf.cssbox</groupId>
        <artifactId>pdf2dom</artifactId>
        <version>2.0.1</version>
    </dependency>

A few more dependencies are also needed. iText is needed to extract the text from a given PDF file, and POI is needed to create the .docx document.

    <dependency>
        <groupId>com.itextpdf</groupId>
        <artifactId>itextpdf</artifactId>
        <version>5.5.10</version>
    </dependency>
    <dependency>
        <groupId>com.itextpdf.tool</groupId>
        <artifactId>xmlworker</artifactId>
        <version>5.5.10</version>
    </dependency>
    <dependency>
        <groupId>org.apache.poi</groupId>
        <artifactId>poi-ooxml</artifactId>
        <version>3.15</version>
    </dependency>
    <dependency>
        <groupId>org.apache.poi</groupId>
        <artifactId>poi-scratchpad</artifactId>
        <version>3.15</version>
    </dependency>

Example Maven Project

Let us start with the project structure and pom.xml, and then look at the source code required to convert from PDF to other formats as well as from other formats to PDF.
pom.xml
Let us look at the key files.

1. PDF and HTML Conversions

ConversionOfPDF2HTMLExample.java

The program below handles both directions, i.e. PDF to HTML and HTML to PDF.
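As a rough illustration of the two directions, the sketch below loads a PDF with PDFBox and writes it out as HTML with Pdf2Dom, then goes the other way with iText's XMLWorker. The class name `PdfHtmlSketch`, the method names, and the file paths are placeholders invented for this sketch and are not taken from ConversionOfPDF2HTMLExample.java. It assumes the dependencies listed earlier are on the classpath and that the input HTML is well-formed XHTML, which `parseXHtml` expects.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PrintWriter;
import java.io.Writer;

import javax.xml.parsers.ParserConfigurationException;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.fit.pdfdom.PDFDomTree;

import com.itextpdf.text.Document;
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.pdf.PdfWriter;
import com.itextpdf.tool.xml.XMLWorkerHelper;

// Hypothetical helper class; names are assumptions made for this sketch.
class PdfHtmlSketch {

    // PDF -> HTML: PDFBox loads the file, Pdf2Dom serializes it as HTML.
    static void pdfToHtml(String pdfPath, String htmlPath)
            throws IOException, ParserConfigurationException {
        try (PDDocument pdf = PDDocument.load(new File(pdfPath));
             Writer out = new PrintWriter(htmlPath, "utf-8")) {
            new PDFDomTree().writeText(pdf, out);
        }
    }

    // HTML -> PDF: iText creates the document, XMLWorker parses the XHTML into it.
    static void htmlToPdf(String htmlPath, String pdfPath)
            throws IOException, DocumentException {
        Document document = new Document();
        try (OutputStream out = new FileOutputStream(pdfPath);
             InputStream in = new FileInputStream(htmlPath)) {
            PdfWriter writer = PdfWriter.getInstance(document, out);
            document.open();
            XMLWorkerHelper.getInstance().parseXHtml(writer, document, in);
            document.close(); // flushes the PDF content to the output stream
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("usage: PdfHtmlSketch <input.pdf|input.html> <output>");
            return;
        }
        if (args[0].toLowerCase().endsWith(".pdf")) {
            pdfToHtml(args[0], args[1]);
        } else {
            htmlToPdf(args[0], args[1]);
        }
    }
}
```

Running it with a `.pdf` input produces HTML, and any other input is treated as XHTML to be turned into a PDF. The round trip is not lossless: layout and fonts in the regenerated file can differ from the original.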
2. PDF and Image Conversions

A PDF can be converted to images in many ways, and one important way is Apache PDFBox. In the other direction, an image can be converted to PDF by using iText.

ConversionOfPDF2ImageExample.java

The program below handles both of these conversions.
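The two conversions described above might look roughly like the following. This is a minimal sketch, not the article's ConversionOfPDF2ImageExample.java: the class `PdfImageSketch`, its method names, the output naming scheme (`prefix-1.png`, `prefix-2.png`, ...), and the `dpi` parameter are all assumptions made for illustration. It renders each page with PDFBox's `PDFRenderer` and wraps an image into a one-page PDF with iText.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.ImageType;
import org.apache.pdfbox.rendering.PDFRenderer;
import org.apache.pdfbox.tools.imageio.ImageIOUtil;

import com.itextpdf.text.Document;
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.Image;
import com.itextpdf.text.pdf.PdfWriter;

// Hypothetical helper class; names are assumptions made for this sketch.
class PdfImageSketch {

    // PDF -> images: one file per page, e.g. "out-1.png", "out-2.png".
    // Returns the number of pages rendered.
    static int pdfToImages(String pdfPath, String outputPrefix, String extension, int dpi)
            throws IOException {
        try (PDDocument document = PDDocument.load(new File(pdfPath))) {
            PDFRenderer renderer = new PDFRenderer(document);
            int pages = document.getNumberOfPages();
            for (int page = 0; page < pages; page++) { // page indexes are 0-based
                BufferedImage image = renderer.renderImageWithDPI(page, dpi, ImageType.RGB);
                // The image format is chosen from the file name extension.
                ImageIOUtil.writeImage(image, outputPrefix + "-" + (page + 1) + "." + extension, dpi);
            }
            return pages;
        }
    }

    // Image -> PDF: a single page with the image scaled to fit inside the margins.
    static void imageToPdf(String imagePath, String pdfPath)
            throws IOException, DocumentException {
        Document document = new Document();
        try (FileOutputStream out = new FileOutputStream(pdfPath)) {
            PdfWriter.getInstance(document, out);
            document.open();
            Image image = Image.getInstance(imagePath);
            float maxWidth = document.getPageSize().getWidth()
                    - document.leftMargin() - document.rightMargin();
            float maxHeight = document.getPageSize().getHeight()
                    - document.topMargin() - document.bottomMargin();
            image.scaleToFit(maxWidth, maxHeight);
            document.add(image);
            document.close();
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("usage: PdfImageSketch <input.pdf> <outputPrefix> | <input.png> <output.pdf>");
            return;
        }
        if (args[0].toLowerCase().endsWith(".pdf")) {
            System.out.println(pdfToImages(args[0], args[1], "png", 300) + " page(s) rendered");
        } else {
            imageToPdf(args[0], args[1]);
        }
    }
}
```

A higher `dpi` gives sharper images at the cost of memory and file size; 300 is used in `main` only as an example value.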
3. PDF and Text Conversions

Here too, Apache PDFBox is needed to get the text from PDF files, and iText is required for text-to-PDF conversion.
ConversionOfPDF2TextExample.java
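One possible shape for these two conversions is sketched below, using PDFBox's `PDFTextStripper` for extraction and iText paragraphs for the reverse direction. The class `PdfTextSketch`, its method names, the UTF-8 input encoding, and the Courier 11pt font are assumptions chosen for this sketch; the original ConversionOfPDF2TextExample.java may differ.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;

import com.itextpdf.text.Document;
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.Font;
import com.itextpdf.text.PageSize;
import com.itextpdf.text.Paragraph;
import com.itextpdf.text.pdf.PdfWriter;

// Hypothetical helper class; names are assumptions made for this sketch.
class PdfTextSketch {

    // PDF -> text: PDFTextStripper walks every page and returns the text it finds.
    static String pdfToText(String pdfPath) throws IOException {
        try (PDDocument document = PDDocument.load(new File(pdfPath))) {
            return new PDFTextStripper().getText(document);
        }
    }

    // Text -> PDF: each line of the input file becomes one iText paragraph.
    static void textToPdf(String textPath, String pdfPath)
            throws IOException, DocumentException {
        Document document = new Document(PageSize.A4);
        try (FileOutputStream out = new FileOutputStream(pdfPath);
             BufferedReader reader =
                     Files.newBufferedReader(Paths.get(textPath), StandardCharsets.UTF_8)) {
            PdfWriter.getInstance(document, out);
            document.open();
            Font font = new Font(Font.FontFamily.COURIER, 11);
            String line;
            while ((line = reader.readLine()) != null) {
                // A single space keeps blank lines visible as vertical gaps.
                document.add(new Paragraph(line.isEmpty() ? " " : line, font));
            }
            document.close();
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.out.println("usage: PdfTextSketch <input.pdf> | <input.txt> <output.pdf>");
            return;
        }
        if (args[0].toLowerCase().endsWith(".pdf")) {
            System.out.println(pdfToText(args[0]));
        } else if (args.length >= 2) {
            textToPdf(args[0], args[1]);
        }
    }
}
```

Text extraction only works on PDFs that contain real text; a scanned PDF made purely of page images yields little or nothing without OCR.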
4. PDF and DocX Conversions

Two libraries are needed, i.e. iText to extract the text from the PDF file and Apache POI to create the .docx document.
ConversionOfPDF2WordExample.java
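To make the division of labour between the two libraries concrete, here is a minimal sketch of the PDF-to-DOCX direction: iText's `PdfReaderContentParser` pulls the text out page by page, and POI's `XWPFDocument` writes one paragraph per page. The class `PdfWordSketch`, the method name, and the choice to insert a page break between pages are assumptions for illustration only. Only plain text is carried over; images, tables, and styling are not.

```java
import java.io.FileOutputStream;
import java.io.IOException;

import org.apache.poi.xwpf.usermodel.BreakType;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFRun;

import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.parser.PdfReaderContentParser;
import com.itextpdf.text.pdf.parser.SimpleTextExtractionStrategy;
import com.itextpdf.text.pdf.parser.TextExtractionStrategy;

// Hypothetical helper class; names are assumptions made for this sketch.
class PdfWordSketch {

    // PDF -> DOCX: the text of each PDF page becomes one paragraph.
    // Returns the number of pages processed.
    static int pdfToDocx(String pdfPath, String docxPath) throws IOException {
        PdfReader reader = new PdfReader(pdfPath);
        try (XWPFDocument doc = new XWPFDocument();
             FileOutputStream out = new FileOutputStream(docxPath)) {
            PdfReaderContentParser parser = new PdfReaderContentParser(reader);
            int pages = reader.getNumberOfPages();
            for (int page = 1; page <= pages; page++) { // iText page numbers are 1-based
                TextExtractionStrategy strategy =
                        parser.processContent(page, new SimpleTextExtractionStrategy());
                XWPFParagraph paragraph = doc.createParagraph();
                XWPFRun run = paragraph.createRun();
                run.setText(strategy.getResultantText());
                if (page < pages) {
                    run.addBreak(BreakType.PAGE); // keep the PDF's page boundaries
                }
            }
            doc.write(out);
            return pages;
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("usage: PdfWordSketch <input.pdf> <output.docx>");
            return;
        }
        System.out.println(pdfToDocx(args[0], args[1]) + " page(s) converted");
    }
}
```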
Conclusion

At many stages of software projects, there are requirements to convert text and images to PDF and, similarly, to convert data from PDF to text, image, and DOCX formats. The examples above show one way to do this in Java.
Referred: https://www.geeksforgeeks.org