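You need to add the following Maven dependency for PDFBox.
<dependency>
  <groupId>org.apache.pdfbox</groupId>
  <artifactId>pdfbox</artifactId>
  <version>2.0.13</version>
</dependency>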
Some of the classes you'll be using for PDF generation with PDFBox are PDDocument, PDPage, PDPageContentStream and PDType1Font.
First let's see a simple Java program where "Hello world" text is written to a PDF using the PDFBox library. This example also shows how to set the font and text color for the content written to the PDF.
import java.awt.Color;
import java.io.IOException;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.font.PDType1Font;

public class PDFGenerator {
  public static final String DEST = "F:\\NETJS\\Test\\HelloWorld.pdf";
  public static void main(String[] args) {
    try {
      PDDocument pdDoc = new PDDocument();
      PDPage page = new PDPage();
      // add page to the document
      pdDoc.addPage(page);
      // write to a page content stream
      try(PDPageContentStream cs = new PDPageContentStream(pdDoc, page)){
        cs.beginText();
        // setting font family and font size
        cs.setFont(PDType1Font.HELVETICA, 14);
        // Text color in PDF
        cs.setNonStrokingColor(Color.BLUE);
        // set offset from where content starts in PDF
        cs.newLineAtOffset(20, 750);
        cs.showText("Hello! This PDF is created using PDFBox");
        cs.newLine();
        cs.endText();
      }
      // save and close PDF document
      pdDoc.save(DEST);
      pdDoc.close();
    } catch(IOException e) {
      e.printStackTrace();
    }
  }
}
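The offset passed to newLineAtOffset() in the example above is measured from the bottom-left corner of the page, in points. A page created with new PDPage() defaults to US Letter size (612 x 792 points), so y = 750 puts the text near the top. Here is a minimal sketch of that arithmetic. The class PageCoordinates and the helper yFromTop are hypothetical names used only for illustration and are not part of PDFBox.

```java
public class PageCoordinates {
    // Height of a US Letter page in points (1/72 inch), the default size of new PDPage().
    static final float LETTER_HEIGHT = 792f;

    // Hypothetical helper: converts a distance measured down from the top edge
    // into the bottom-left-origin y coordinate that PDF content streams expect.
    static float yFromTop(float pageHeight, float distanceFromTop) {
        return pageHeight - distanceFromTop;
    }

    public static void main(String[] args) {
        // A text baseline 42 points below the top edge of a Letter page
        System.out.println(yFromTop(LETTER_HEIGHT, 42f)); // prints 750.0
    }
}
```

With a different page size, the same subtraction applies using that page's height instead of 792.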
If you want to add a new page to an existing PDF, you can load the existing PDF and add a page to it.
import java.awt.Color;
import java.io.File;
import java.io.IOException;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.font.PDType1Font;

public class PDFGenerator {
  public static final String DEST = "F:\\NETJS\\Test\\HelloWorld.pdf";
  public static void main(String[] args) {
    try {
      // load existing PDF
      PDDocument pdDoc = PDDocument.load(new File(DEST));
      PDPage page = new PDPage();
      // add page to the document
      pdDoc.addPage(page);
      // write to a page content stream
      try(PDPageContentStream cs = new PDPageContentStream(pdDoc, page)){
        cs.beginText();
        // setting font family and font size
        cs.setFont(PDType1Font.HELVETICA, 14);
        // Text color in PDF
        cs.setNonStrokingColor(Color.BLUE);
        // set offset from where content starts in PDF
        cs.newLineAtOffset(20, 750);
        cs.showText("Adding content to an existing PDF");
        cs.newLine();
        cs.endText();
      }
      // save and close PDF document
      pdDoc.save(new File("F:\\NETJS\\Test\\Changed.pdf"));
      pdDoc.close();
    } catch(IOException e) {
      e.printStackTrace();
    }
  }
}
If you have text that may span multiple lines in the PDF, you need to write the logic yourself to divide that text into lines that fit the available width. PDFBox has no built-in support for wrapping text, so if you add long text directly it is written to the PDF as a single line.
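The splitting logic on its own can be sketched without PDFBox. The sketch below is a greedy word wrap: it keeps appending words to the current line until the measured width would exceed the limit. The class WrapSketch, the method wrap, and the width function passed to it are illustrative assumptions. The stand-in measurement treats every character as 500 glyph units wide at font size 14, which is not how a real font behaves.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToDoubleFunction;

public class WrapSketch {
    // Greedy wrap: a word is appended to the current line unless the resulting
    // width exceeds maxWidth, in which case the word starts a new line.
    static List<String> wrap(String text, double maxWidth, ToDoubleFunction<String> width) {
        List<String> lines = new ArrayList<>();
        StringBuilder line = new StringBuilder();
        // split the text on one or more whitespace characters
        for (String word : text.trim().split("\\s+")) {
            String candidate = line.length() == 0 ? word : line + " " + word;
            if (line.length() > 0 && width.applyAsDouble(candidate) > maxWidth) {
                lines.add(line.toString());
                line = new StringBuilder(word);
            } else {
                line = new StringBuilder(candidate);
            }
        }
        if (line.length() > 0) {
            lines.add(line.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Assumed measurement: fontSize * glyphUnits / 1000 with 500 units per
        // character at size 14, which works out to 7 points per character.
        ToDoubleFunction<String> width = s -> 14 * 500.0 * s.length() / 1000;
        // With a 70 point limit at most 10 characters fit on a line
        System.out.println(wrap("the quick brown fox jumps", 70, width));
        // prints [the quick, brown fox, jumps]
    }
}
```

In a PDFBox program the measurement would come from the font instead, as fontSize * font.getStringWidth(text) / 1000, since getStringWidth() reports widths in units of 1/1000 of the font size. A word that is wider than the limit by itself is still placed on its own line in this sketch rather than being broken up.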
import java.awt.Color;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.font.PDFont;
import org.apache.pdfbox.pdmodel.font.PDType1Font;

public class PDFGenerator {
  public static final String DEST = "F:\\NETJS\\Test\\MultiLine.pdf";
  public static void main(String[] args) {
    try {
      PDDocument pdDoc = new PDDocument();
      PDPage page = new PDPage();
      // add page to the document
      pdDoc.addPage(page);
      // write to a page content stream
      try(PDPageContentStream cs = new PDPageContentStream(pdDoc, page)){
        cs.beginText();
        // setting font family and font size
        cs.setFont(PDType1Font.HELVETICA, 14);
        // Text color in PDF
        cs.setNonStrokingColor(Color.BLUE);
        // set offset from where content starts in PDF
        cs.newLineAtOffset(20, 750);
        // required when using newLine()
        cs.setLeading(12);
        cs.showText("First line added to the PDF");
        cs.newLine();
        String text = "This is a long text which spans multiple lines, this text checks if the line is changed as per the allotted width in the PDF or not."
          + "The Apache PDFBox library is an open source tool written in Java for working with PDF documents.";
        divideTextIntoMultipleLines(text, 580, page, cs, PDType1Font.HELVETICA, 14);
        cs.endText();
      }
      // save and close PDF document
      pdDoc.save(DEST);
      pdDoc.close();
    } catch(IOException e) {
      e.printStackTrace();
    }
  }

  private static void divideTextIntoMultipleLines(String text, int allowedWidth, PDPage page,
      PDPageContentStream contentStream, PDFont font, int fontSize) throws IOException {
    List<String> lines = new ArrayList<>();
    String line = "";
    // split the text on one or more spaces
    String[] words = text.split("\\s+");
    for(String word : words) {
      if(!line.isEmpty()) {
        line += " ";
      }
      // check for width boundaries (getStringWidth returns 1/1000 units of font size)
      int size = (int) (fontSize * font.getStringWidth(line + word) / 1000);
      if(size > allowedWidth) {
        // if line + new word > allowed width, add the line to the list without the word
        lines.add(line);
        // start new line with the current word
        line = word;
      } else {
        // if line + word <= allowed width, append the word to the line
        line += word;
      }
    }
    lines.add(line);
    // write lines to content stream
    for(String ln : lines) {
      contentStream.showText(ln);
      contentStream.newLine();
    }
  }
}
import java.awt.Color;
import java.io.IOException;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.font.PDType1Font;
import org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject;

public class PDFWithImage {
  public static final String DEST = "F:\\NETJS\\Test\\image.pdf";
  public static void main(String[] args) {
    try {
      PDDocument pdDoc = new PDDocument();
      PDPage page = new PDPage();
      // add page to the document
      pdDoc.addPage(page);
      // Create image object using the image location
      PDImageXObject image = PDImageXObject.createFromFile("F:\\NETJS\\netjs.png", pdDoc);
      // write to a page content stream
      try(PDPageContentStream cs = new PDPageContentStream(pdDoc, page)){
        cs.beginText();
        // setting font family and font size
        cs.setFont(PDType1Font.HELVETICA, 14);
        // Text color in PDF
        cs.setNonStrokingColor(Color.BLUE);
        // set offset from where content starts in PDF
        cs.newLineAtOffset(20, 750);
        cs.showText("Adding image to PDF Example");
        cs.newLine();
        cs.endText();
        // draw image at (20, 450) with width 350 and height 200
        cs.drawImage(image, 20, 450, 350, 200);
      }
      // save and close PDF document
      pdDoc.save(DEST);
      pdDoc.close();
    } catch(IOException e) {
      e.printStackTrace();
    }
  }
}
When you encrypt a PDF you can configure two things:
1. Access permissions- In PDFBox the access permissions granted to the user are configured using the AccessPermission class. Permissions such as printing the document or extracting its content can be allowed or denied for an encrypted PDF document.
2. Setting the password- For password protection, the StandardProtectionPolicy class is used in PDFBox. You pass the owner password, the user password and an AccessPermission object as arguments to the constructor of this class.
With the owner password you can open the PDF without any restrictions on access.
With the user password you can open the PDF with the restricted access permissions.
In the example an existing PDF is loaded, then access permissions and passwords are set for that PDF.
import java.io.File;
import java.io.IOException;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.encryption.AccessPermission;
import org.apache.pdfbox.pdmodel.encryption.StandardProtectionPolicy;

public class PDFGenerator {
  public static final String PDF = "F:\\NETJS\\Test\\HelloWorld.pdf";
  final static String USER_PASSWORD = "user";
  final static String OWNER_PASSWORD = "owner";
  public static void main(String[] args) {
    try {
      // load an existing PDF
      PDDocument document = PDDocument.load(new File(PDF));
      AccessPermission ap = new AccessPermission();
      /** Setting access permissions */
      // Can't print PDF
      ap.setCanPrint(false);
      // Can't copy PDF
      ap.setCanExtractContent(false);
      /** Access permissions End */
      /** Setting password */
      StandardProtectionPolicy sp = new StandardProtectionPolicy(OWNER_PASSWORD, USER_PASSWORD, ap);
      sp.setEncryptionKeyLength(128);
      document.protect(sp);
      document.save(PDF);
      document.close();
    } catch(IOException e) {
      e.printStackTrace();
    }
  }
}
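The access permissions set in the example above end up in the encrypted PDF as a set of bit flags. The sketch below is a simplified model of that idea, assuming the bit positions given in the PDF specification's permissions table (bit 3 for printing, bit 5 for copying or extracting content). The class PermissionBits and its methods are hypothetical and are not PDFBox's API or its internal implementation.

```java
public class PermissionBits {
    // 1-based bit positions, assumed from the PDF specification's permissions table
    static final int PRINT_BIT = 3;
    static final int EXTRACT_BIT = 5;

    // Returns flags with the given permission bit turned on or off.
    static int set(int flags, int bit, boolean allowed) {
        int mask = 1 << (bit - 1);
        return allowed ? (flags | mask) : (flags & ~mask);
    }

    // True if the given permission bit is on.
    static boolean isSet(int flags, int bit) {
        return (flags & (1 << (bit - 1))) != 0;
    }

    public static void main(String[] args) {
        int flags = ~0; // start with every permission allowed
        flags = set(flags, PRINT_BIT, false);   // like ap.setCanPrint(false)
        flags = set(flags, EXTRACT_BIT, false); // like ap.setCanExtractContent(false)
        System.out.println(isSet(flags, PRINT_BIT));   // prints false
        System.out.println(isSet(flags, EXTRACT_BIT)); // prints false
        System.out.println(isSet(flags, 4));           // prints true, other bits are untouched
    }
}
```

Each permission is independent, which is why the example can deny printing and content extraction while leaving every other permission at its default.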
You can use the PDFTextStripper class to extract text from a PDF using PDFBox.
import java.io.File;
import java.io.IOException;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;

public class PDFGenerator {
  public static final String PDF = "F:\\NETJS\\Test\\MultiLine.pdf";
  public static void main(String[] args) {
    try {
      // load an existing PDF
      PDDocument document = PDDocument.load(new File(PDF));
      PDFTextStripper textStripper = new PDFTextStripper();
      // Get count of total pages in the PDF
      int pageCount = document.getNumberOfPages();
      // set start page for extraction
      textStripper.setStartPage(1);
      // set last page for extraction
      textStripper.setEndPage(pageCount);
      // extract text from all the pages between start and end
      String text = textStripper.getText(document);
      System.out.println(text);
      document.close();
    } catch(IOException e) {
      e.printStackTrace();
    }
  }
}
First line added to the PDF This is a long text which spans multiple lines, this text checks if the line is changed as per the allotted width in the PDF or not.The Apache PDFBox library is an open source tool written in Java for working with PDF documents.
You can use the PDResources class to extract images from a PDF using PDFBox.
Using the PDResources class you can get all the resources available at the page level. You can iterate over those resources and check whether each one is an image; if it is, write that image to the specified location.
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;
import org.apache.pdfbox.cos.COSName;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDResources;
import org.apache.pdfbox.pdmodel.common.PDStream;
import org.apache.pdfbox.pdmodel.graphics.PDXObject;
import org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject;

public class PDFGenerator {
  public static final String PDF = "F:\\NETJS\\Test\\image.pdf";
  public static final String IMG_LOCATION = "F:\\NETJS\\Test";
  public static void main(String[] args) {
    try {
      // load an existing PDF
      PDDocument document = PDDocument.load(new File(PDF));
      // get resources for the first page
      PDResources pdResources = document.getPage(0).getResources();
      int i = 0;
      for(COSName csName : pdResources.getXObjectNames()) {
        PDXObject pdxObject = pdResources.getXObject(csName);
        // if resource obj is of type image
        if(pdxObject instanceof PDImageXObject) {
          PDStream pdStream = pdxObject.getStream();
          PDImageXObject image = new PDImageXObject(pdStream, pdResources);
          i++;
          // image storage location and image name
          File imgFile = new File(IMG_LOCATION + "\\Pdfimage" + i + ".png");
          ImageIO.write(image.getImage(), "png", imgFile);
        }
      }
      document.close();
    } catch(IOException e) {
      e.printStackTrace();
    }
  }
}
That's all for this topic, Creating PDF in Java Using Apache PDFBox. If you have any doubts or suggestions, please drop a comment. Thanks!