Extracting Elevation Data from PDF files (Contours, Spot Levels etc.)
01-27-2017, 02:50 PM
(This post was last modified: 01-30-2017, 10:51 AM by Ted Woods. Edit Reason: Spelling correction)
I have been looking into the process of extracting elevation data from PDF files into CAD data for use in Kubla's takeoff module.
A number of issues come up immediately, because PDF files are not designed to store technical metadata, so technical complications often arise during the process. It should not be attempted with the expectation of perfect results every time. However, it is potentially a huge time saver. I have identified the following as items we could extract:
Able2Extract Standard www.investintech.com
Able2Extract Pro www.investintech.com
PDF2DWG www.dotsoft.com
Print2CAD http://www.backtocad.com/
Has anyone else got experience with these? To me it seems Print2CAD, or Back2CAD as it is also known, is the market leader in this area. They run regular seminars, and the software can do things like convert dashed lines into CAD polylines and extract text using Optical Character Recognition (OCR).
(06-09-2017, 03:11 PM) Digger662 Wrote:
It has been my understanding that vectors contain unique information that can tell the software which vector is next in line when importing. One of the problems I see is that the direction in which the engineer drew the line, or the type of line they used, can have a large impact on importing linework. I have seen Ghostscript, VectorDraw, and VeryPDF used too.
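To illustrate the drawing-direction problem described above, here is a minimal sketch of how an importer might chain loose line segments into polylines regardless of which way each one was drawn. It is an assumption about how such a step could work, not how any of the packages mentioned in this thread actually do it; the name `join_segments` and the `tol` snapping tolerance are made up for the example.

```python
from collections import defaultdict

def join_segments(segments, tol=0.01):
    """Chain 2-point segments into polylines, ignoring drawing direction."""
    def key(pt):
        # Snap to a grid so nearly identical endpoints compare equal.
        return (round(pt[0] / tol), round(pt[1] / tol))

    # Map each snapped endpoint to the indices of the segments touching it.
    touching = defaultdict(list)
    for i, (a, b) in enumerate(segments):
        touching[key(a)].append(i)
        touching[key(b)].append(i)

    used = set()

    def extend(chain):
        # Grow from the last point while exactly one unused segment continues it.
        # At a junction (two or more candidates) we stop rather than guess.
        while True:
            candidates = [i for i in touching[key(chain[-1])] if i not in used]
            if len(candidates) != 1:
                return
            i = candidates[0]
            used.add(i)
            a, b = segments[i]
            # Append whichever end is NOT the one we arrived at.
            chain.append(b if key(a) == key(chain[-1]) else a)

    polylines = []
    for i, (a, b) in enumerate(segments):
        if i in used:
            continue
        used.add(i)
        chain = [a, b]
        extend(chain)    # grow past the end point
        chain.reverse()
        extend(chain)    # grow past the original start point
        chain.reverse()  # restore the seed segment's orientation
        polylines.append(chain)
    return polylines

# The second segment is drawn "backwards" but still joins the chain.
segments = [((0, 0), (1, 0)), ((2, 0), (1, 0)), ((2, 0), (3, 1)), ((10, 10), (11, 10))]
result = join_segments(segments)
```

Stopping at junctions is a deliberate choice in this sketch: where three or more lines meet, the software cannot know which one continues the contour, so that decision is left to the user.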
(06-12-2017, 09:34 AM) Ted Woods Wrote:
Hi Digger662,
Welcome to the forums and thanks for your input. I have not heard of those three packages apart from Ghostscript, which is interesting as I think it is actually free to download and use. I will download it and give it a go with some site plans to see how effective it is. Currently I have been using Inkscape, which is really hit and miss. With complicated data it just does not export to a DXF at all. However, I think we would not be able to use it directly in Kubla Cubed without paying, because the license forbids commercial distribution.
I agree with you about the problems with the way the engineer has defined the lines. Basically, PDF files were never intended to be used this way, so it is really difficult to have a consistent workflow to get the data in without a lot of technical understanding on the part of the user. A bit later on I am going to try to publish a blog post about how to convert PDF to DXF, which will hopefully be helpful for users with CAD expertise. However, we are going to be creating automatic line extraction tools in Kubla Cubed in the long term, so hopefully all this complexity won't be necessary.
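As a rough illustration of the DXF side of a PDF-to-DXF conversion, the sketch below writes already-extracted polylines as LINE entities in a minimal ASCII DXF. It assumes the vertices have already been pulled out of the PDF by some other tool. The function name `polylines_to_dxf` and the layer name `CONTOURS` are made up for this example, and some CAD packages may want a fuller file with header and tables sections.

```python
def polylines_to_dxf(polylines, layer="CONTOURS"):
    """Return minimal ASCII DXF text with one LINE entity per polyline segment."""
    # DXF is a flat list of (group code, value) pairs, one item per line.
    out = ["0", "SECTION", "2", "ENTITIES"]
    for points in polylines:
        # Pair each vertex with the next one to get the individual segments.
        for (x1, y1), (x2, y2) in zip(points, points[1:]):
            out += [
                "0", "LINE",
                "8", layer,                # group 8: layer name
                "10", f"{x1:.4f}",         # groups 10/20/30: start X/Y/Z
                "20", f"{y1:.4f}",
                "30", "0.0",               # elevation not known yet, so Z = 0
                "11", f"{x2:.4f}",         # groups 11/21/31: end X/Y/Z
                "21", f"{y2:.4f}",
                "31", "0.0",
            ]
    out += ["0", "ENDSEC", "0", "EOF"]
    return "\n".join(out) + "\n"

# One polyline with three vertices becomes two LINE entities.
dxf = polylines_to_dxf([[(0, 0), (10, 0), (10, 5)]])
```

The Z values are written as zero because a PDF carries no elevation for its linework; that still has to be added afterwards in CAD or in the takeoff software.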
(09-18-2018, 09:43 PM) AggieBQ86 Wrote:
I think if the contour vectors could be selected from the PDF and imported with a zero value for elevation, and the contours could then be selected one by one to change the elevation, that would be a significant improvement to the current workflow of tracing contours. Maybe allow individual vectors to be deleted, broken or trimmed if they are imported improperly.

Hi AggieBQ86,
Yes, that is what we planned to do: effectively allow the user to pick vectors out of the site plan to use as a contour line and then enter the elevation. It is quite tricky in some ways, though. PDF files were never designed to store CAD data; they are a print format, so there are a number of complications. Have you tried the other techniques I mentioned above? It would be worth experimenting with Inkscape to see if you can convert the PDF into a CAD file (there are tutorials online). Then scale it, delete all the data apart from contour lines and import into Kubla Cubed. I have had limited success with this method. The last version of Inkscape I had seemed to load the PDF files OK but then crash when converting; they might have fixed things now, though. It is worth getting the latest version and giving it a go.
Products like Back2CAD claim to be able to extract contour lines and even turn dashed lines into solid polylines. It is not a free product, but if you do a lot of take-off it might be worth a look.
I have had one report of this working for a user, but of course there were no elevation details, so you would have to add them manually in either CAD or Kubla Cubed. Let us know how you get on.

Hi AggieBQ86,
Just a quick update: the latest release, Kubla Cubed 2021, has now been launched! Thank you for your PDF vector extraction suggestions. The following have now been implemented:
- Contour PDF vector extraction (importing as 0.00 values)
- Join Tool
- Split Tool
- Set Multiple Elevations (SME) Tool
Plus, there are many other updates; you can see these in our video What's New in 2021? If you are a Subscription Licence Holder, you can upgrade today by following the instructions when you open Kubla Cubed on your desktop. Please keep your new feature suggestions coming!
Kate
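For readers wondering what the "import at 0.00, then set elevations" workflow amounts to, here is a minimal sketch. It is not Kubla Cubed's implementation: the `Contour` class, the `set_multiple_elevations` function and its parameters are all hypothetical, and it simply assumes the user picks contours in order and supplies a starting elevation and a fixed contour interval.

```python
from dataclasses import dataclass

@dataclass
class Contour:
    points: list             # (x, y) vertices extracted from the PDF
    elevation: float = 0.0   # placeholder, since the PDF stores no elevation

def set_multiple_elevations(contours, picked, start, interval):
    """Assign start, start + interval, ... to contours in the order picked."""
    for step, index in enumerate(picked):
        contours[index].elevation = start + step * interval

# Four contours come in at 0.0; the user then clicks three of them in order.
contours = [Contour([(0, 0), (1, 0)]) for _ in range(4)]
set_multiple_elevations(contours, picked=[2, 0, 3], start=100.0, interval=0.5)
elevations = [c.elevation for c in contours]
```

Any contour that is not picked keeps its 0.0 placeholder, which makes it easy to spot lines that still need an elevation.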
(03-18-2023, 02:46 PM) kbkhan Wrote:
Hey there! I see you've been doing some interesting work with extracting elevation data from PDF files. Although I'm not familiar with the software you mentioned, it's great that you're experimenting with different options to get the best results. It's true that PDF files can be tricky when it comes to technical metadata, so it's important to have the right tools to handle the job. Have you considered trying out Smart Engines SDK for OCR? It might be able to help you with extracting text from PDF files for elevation data. Anyway, keep up the great work, and don't hesitate to ask for help or advice on the forum!
I am not expert enough to solve this problem for you, but I really hope someone here can guide you!
Hi. Thanks for the tip. We are a little way off investigating Optical Character Recognition at this stage, but when we get onto it we'll take a look.