Kitz ADSL Broadband Information

Author Topic: PDF to CSV/Excel spreadsheet conversion.  (Read 5266 times)

renluop

  • Kitizen
  • ****
  • Posts: 3326
PDF to CSV/Excel spreadsheet conversion.
« on: February 17, 2011, 04:01:30 PM »

Hitherto it has been possible to download call details in bills to a CSV file that can be imported into Excel.
BT in the kindness of their wisdom ::) have removed the facility, leaving only PDF downloads.

Is there any way to convert PDFs to CSVs, as cheaply as possible?

geep

  • Reg Member
  • ***
  • Posts: 452
    • My ST546 Statistics
Re: PDF to CSV/Excel spreadsheet conversion.
« Reply #1 on: February 18, 2011, 12:15:27 AM »

I've converted a BT phone bill from .pdf to Excel via .jpg and OCR, but the results aren't very good.
The tables are preserved, more or less, but £ often comes out as f, and there are lots of other errors.

Use the free ImageMagick convert command to get a .jpg for each page:

convert -density 200 phone.pdf phone_%d.jpg

Then I used the OCR software that came with my Canon scanner (ScanSoft OmniPage) to OCR each .jpg into .rtf.

The table format is preserved intelligently when I open the .rtf in MS Word.
I can then copy & paste from Word to Excel.
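The £-read-as-f errors mentioned above could be tidied up in the OCR text before it goes into Excel. This is only a rough sketch: the function name `fix_pound_signs` is made up for illustration, and it assumes amounts always look like `f12.50` (an "f" straight in front of a number with two decimal places). Other OCR errors would still need fixing by hand.

```python
import re

# Assumption: a misread pound sign shows up as an "f" that is not part of a
# longer word and is immediately followed by an amount such as 12.50.
MISREAD_POUND = re.compile(r"(?<![A-Za-z])f(?=\d+\.\d{2}\b)")


def fix_pound_signs(text):
    """Replace an 'f' that stands in front of a money amount with '£'."""
    return MISREAD_POUND.sub("£", text)


if __name__ == "__main__":
    print(fix_pound_signs("Calls f3.42  Line rental f12.50"))
```

The lookbehind keeps words that merely end in "f" from being touched, so the pattern errs on the side of leaving text alone.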

I've used this technique very successfully to get my food diary into Word, but it's not nearly as good with the BT Bill.
If you don't have a scanner with OCR software, recent versions of MS Word (2003 onwards, I think) have OCR capability, but I haven't tried it.

If you're on Linux you should have a whole raft of PDF converters, but those I tried don't keep the document format at all well, so the data would need postprocessing. Could be worth a few minutes' experimentation:
/usr/bin/pdf2dsc
/usr/bin/pdf2ps
/usr/bin/pdffonts
/usr/bin/pdfimages
/usr/bin/pdfinfo
/usr/bin/pdfopt
/usr/bin/pdfroff
/usr/bin/pdftexi2dvi
/usr/bin/pdftoabw
/usr/bin/pdftohtml
/usr/bin/pdftoppm
/usr/bin/pdftops
/usr/bin/pdftotext
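As an idea of what the postprocessing might look like: pdftotext has a -layout option that tries to keep the columns lined up, so a text file made with `pdftotext -layout phone.pdf phone.txt` could be split into columns wherever there are two or more spaces. This is a minimal sketch under that assumption. The function names and the two-space rule are mine, not anything from BT or from the tools, and a real bill would probably need extra handling for headers and wrapped lines.

```python
import csv
import io
import re

# Assumption: columns in the layout-preserved text are separated by at
# least two spaces, while single spaces belong inside a field.
COLUMN_GAP = re.compile(r"\s{2,}")


def layout_text_to_rows(text):
    """Split each non-blank line into fields at runs of 2+ whitespace."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines and page breaks
        rows.append(COLUMN_GAP.split(line))
    return rows


def rows_to_csv(rows):
    """Return the rows as CSV text that Excel can import."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    # Made-up sample lines, not taken from a real bill.
    sample = "Date        Number          Cost\n17 Feb      01234 567890    0.25\n"
    print(rows_to_csv(layout_text_to_rows(sample)))
```

Splitting on two or more spaces keeps things like "17 Feb" together in one cell, but it will go wrong wherever the bill only leaves a single space between columns.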

Cheers,
Peter




renluop

  • Kitizen
  • ****
  • Posts: 3326
Re: PDF to CSV/Excel spreadsheet conversion.
« Reply #2 on: February 18, 2011, 09:38:55 AM »

Something to try, Thanks :)

tuftedduck

  • Senior Kitizen
  • ******
  • Posts: 29658
  • Router Luvvin Duck
Re: PDF to CSV/Excel spreadsheet conversion.
« Reply #3 on: February 18, 2011, 09:59:23 AM »

This is a good online freebie.

http://www.pdftoexcelonline.com/default.aspx

renluop

  • Kitizen
  • ****
  • Posts: 3326
Re: PDF to CSV/Excel spreadsheet conversion.
« Reply #4 on: February 18, 2011, 03:56:49 PM »

Thx once again. :)

geep

  • Reg Member
  • ***
  • Posts: 452
    • My ST546 Statistics
Re: PDF to CSV/Excel spreadsheet conversion.
« Reply #5 on: February 18, 2011, 07:11:49 PM »

Quote
This is a good online freebie.

http://www.pdftoexcelonline.com/default.aspx

Tried it. It only extracted 5 of the 7 lines of the "Where you called" table from my 8-page phone bill. It omitted several other tables, and all the other text.

But on my Food Diary PDF it did a pretty good job. This PDF is virtually all table: actually 2 tables, with some header and footer text. It captured the first table almost perfectly, and put the second table on sheet 2 and some other stuff on sheet 3.

Cheers,
Peter
 
