PDF Data Extraction In Linux
First, install poppler-utils, which provides the pdfimages and pdftotext tools:
- Ubuntu:
sudo apt-get install poppler-utils
- Fedora:
sudo yum install poppler-utils
For other Linux distributions, search for poppler-utils in your package manager.
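This command extracts all the images from "pdffile.pdf" into the /home/<username>/pdfimages/ directory, which must already exist. The -j option saves JPEG images as JPEG files; other images are written as PPM or PBM files: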
pdfimages -j pdffile.pdf ~/pdfimages/
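pdfimages treats its last argument as a filename prefix rather than a directory, so it can help to create the directory first and pass an explicit prefix. The following is a minimal sketch, not a definitive script: the function name extract_images and the prefix "img" are hypothetical, and it assumes poppler-utils is installed.

```shell
# Hypothetical helper: extract the images from one PDF into a directory.
# Assumes poppler-utils is installed; "img" is an arbitrary filename prefix.
extract_images() {
    pdf=$1
    outdir=$2

    # pdfimages does not create the output directory itself.
    mkdir -p "$outdir" || return 1

    # The last argument is a filename prefix, so files are written as
    # "$outdir/img-000.jpg", "$outdir/img-001.ppm", and so on.
    pdfimages -j "$pdf" "$outdir/img"
}
```

For example, `extract_images pdffile.pdf ~/pdfimages` would place the extracted files in ~/pdfimages/ with names starting with "img-".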
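This command extracts the text from "pdffile.pdf" and saves it as pdffile.txt in the same directory:
pdftotext pdffile.pdf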
Note that this command only extracts real text. If your PDF contains images with text printed on them (for example, scanned pages), that text will not be extracted.
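Because pdftotext produces little or no output for image-only pages, one rough way to spot such PDFs is to check whether the extracted file contains any non-whitespace characters. This is only a sketch under stated assumptions: the function name extract_text and the return code 2 are hypothetical, and the -layout option (which asks pdftotext to keep the page layout) is optional.

```shell
# Hypothetical helper: extract text from a PDF and warn when nothing was found.
# Assumes poppler-utils is installed.
extract_text() {
    pdf=$1
    txt=$2

    # -layout asks pdftotext to preserve the physical layout of the page.
    pdftotext -layout "$pdf" "$txt" || return 1

    # If the output holds only whitespace (including page-break form feeds),
    # the PDF probably consists of images rather than real text.
    if ! grep -q '[^[:space:]]' "$txt"; then
        echo "warning: no text found in $pdf (possibly scanned images)" >&2
        return 2
    fi
}
```

For example, `extract_text pdffile.pdf pdffile.txt` writes the text to pdffile.txt and returns a non-zero status if the result is empty.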
