The Portable Document Format (PDF) is a universal file format that combines characteristics of both text documents and graphic images, which makes it one of the most commonly used file types today. The reason PDF is so widely popular is that it preserves original document formatting: a PDF is designed to look the same on any device or operating system. Most people head right to Adobe Acrobat Reader when they need to open a PDF. Adobe created the PDF standard, and its program is certainly the most popular free PDF reader out there. It's completely fine to use, but I find it to be a somewhat bloated program with lots of features that you may never need or want to use.

I am attempting to iterate over a series of URLs listed in Excel and generate a complete text file for each. The VBA solution in the previous post uses Word to open the PDF URL and generates a text file from it. This is a big improvement over using Power Query. However, I have come across one instance where it fails: Word recognises the text on the first page as an image, so this text is omitted from the generated text file. One example PDF produces a text file with text missing.

As advised, I have downloaded a version of Adobe Acrobat (Adobe Acrobat XI Standard), which makes the Adobe Acrobat 10.0 Type Library reference available (and a bunch of others). I hope this will enable more accurate text conversion, but I will need help to develop this script.

So, for this problem, I wish to modify the following code to swap Word for Adobe to perform the conversion. What I am unsure of, however, is whether Adobe can open a PDF from a URL in the same way Word can. I think I may need to introduce some steps to download the PDF into the folder first and then convert each one. Ideally it would work like the code below, but if downloading is necessary, then so be it.

My code from the previous answer:

    Sub Tester()
        Dim fileStream As TextStream, ws As Worksheet
        Dim oWd As Object, oDoc As Object, c As Range, fileRoot As String

        Set oWd = CreateObject("word.application")
        Set ws = Worksheets("Data") 'use a specific worksheet reference
        fileRoot = ws.Range("D2").Value 'read this once
        If Right(fileRoot, 1) <> "\" Then fileRoot = fileRoot & "\" 'ensure terminating \

        For Each c In ws.Range("B2:B" & ws.Cells(Rows.Count, "B").End(xlUp).row).Cells
            '...
            If LCase(url) Like "http?:*" Then 'has a URL
                '...
                filePath = fileRoot & c.Offset(0, -1).Value & ".txt" 'filename from ColA
                On Error Resume Next 'ignore error if no document.
                '...
                Set fileStream = fso.CreateTextFile(filePath, overwrite:=True, Unicode:=True)
                '...

As for developing this script, I have the following from what I have found online, which now successfully performs the conversion using Acrobat for the previously difficult PDF, so that is a good start:

    Sub convertpdf2()
        Dim AcroXApp As Acrobat.AcroApp
        '...
        Set AcroXApp = CreateObject("AcroExch.App")
        Set AcroXAVDoc = CreateObject("AcroExch.AVDoc")
        '...

The remaining step is to download the PDF from a list of URLs (or open the PDF) to a specified folder, if necessary. I think if I can achieve this, I will be able to work out how to merge it with the previous post so that it can loop through a given number of URLs to generate the text files, as well as handle the non-Unicode characters.

Update: attempting to add error-handling highlighting.

    #If VBA7 Then
        Private Declare PtrSafe Function URLDownloadToFile Lib "urlmon" _
            Alias "URLDownloadToFileA" (ByVal pCaller As LongPtr, ByVal szURL As String, _
            ByVal szFileName As String, ByVal dwReserved As Long, ByVal lpfnCB As LongPtr) As Long
    #Else
        Private Declare Function URLDownloadToFile Lib "urlmon" _
            Alias "URLDownloadToFileA" (ByVal pCaller As Long, ByVal szURL As String, _
            ByVal szFileName As String, ByVal dwReserved As Long, ByVal lpfnCB As Long) As Long
    #End If

        Dim c As Range, fileRoot As String, pdfPath As String, success As Boolean
        '...
        fileRoot = ws.Range("C2").Value 'read this once
        For Each c In ws.Range("B2:B" & ws.Cells(Rows.Count, "B").End(xlUp).row).Cells
            '...
            pdfPath = fileRoot & "PDF_" & c.Offset(0, -1).Value & ".pdf"
            '...
            success = ConvertPdf2(pdfPath, fileRoot & c.Offset(0, -1).Value & ".txt")
            c.Interior.Color = IIf(success, vbGreen, vbRed)
            c.Offset(0, 1).Value = IIf(success, "OK", "PDF not openable")
            '...
            c.Offset(0, 1).Value = "No PDF returned" 'flag in col C
            '...

    'returns true if conversion was successful (based on whether `Open` succeeded or not)
    Function ConvertPdf2(pdfPath As String, textPath As String) As Boolean
        '...
        success = AcroXAVDoc.Open(pdfPath, "Acrobat")
        '...
        JsObj.SaveAs textPath, "-text"
        '...

    Function DownloadFile(sURL, sSaveAs) As Boolean
        DownloadFile = (URLDownloadToFile(0, sURL, sSaveAs, 0, 0) = 0)
    End Function
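For anyone trying to piece the download-then-convert idea together, here is a minimal end-to-end sketch in VBA. It is not the original answer's code, only one plausible way to assemble the pieces, and it rests on several assumptions: the sheet is named "Data", file names are in column A, URLs in column B, the output folder is in D2 (so that the status flags written to column C do not overwrite it), and a full Acrobat install (not just Reader) is present. The conversion ID "com.adobe.acrobat.plain-text" is also an assumption on my part, since the post only shows the tail of that string. Late binding (`As Object`) is used so that no type library reference is needed.

```vba
Option Explicit

' Windows API download helper; the two declarations cover VBA7 (64-bit capable) and older hosts.
#If VBA7 Then
    Private Declare PtrSafe Function URLDownloadToFile Lib "urlmon" _
        Alias "URLDownloadToFileA" (ByVal pCaller As LongPtr, ByVal szURL As String, _
        ByVal szFileName As String, ByVal dwReserved As Long, ByVal lpfnCB As LongPtr) As Long
#Else
    Private Declare Function URLDownloadToFile Lib "urlmon" _
        Alias "URLDownloadToFileA" (ByVal pCaller As Long, ByVal szURL As String, _
        ByVal szFileName As String, ByVal dwReserved As Long, ByVal lpfnCB As Long) As Long
#End If

Sub Tester()
    Dim ws As Worksheet, c As Range, url As String
    Dim fileRoot As String, pdfPath As String, success As Boolean

    Set ws = Worksheets("Data")                 'assumed sheet name
    fileRoot = ws.Range("D2").Value             'assumed location of the output folder
    If Right(fileRoot, 1) <> "\" Then fileRoot = fileRoot & "\"

    For Each c In ws.Range("B2:B" & ws.Cells(ws.Rows.Count, "B").End(xlUp).Row).Cells
        url = Trim(c.Value)
        If LCase(url) Like "http*" Then         'accepts both http and https
            pdfPath = fileRoot & "PDF_" & c.Offset(0, -1).Value & ".pdf"
            If DownloadFile(url, pdfPath) Then
                success = ConvertPdf2(pdfPath, fileRoot & c.Offset(0, -1).Value & ".txt")
                c.Interior.Color = IIf(success, vbGreen, vbRed)
                c.Offset(0, 1).Value = IIf(success, "OK", "PDF not openable")
            Else
                c.Interior.Color = vbRed
                c.Offset(0, 1).Value = "No PDF returned" 'flag in col C
            End If
        End If
    Next c
End Sub

'Returns True if the download call reported success (return code 0).
Function DownloadFile(sURL As String, sSaveAs As String) As Boolean
    DownloadFile = (URLDownloadToFile(0, sURL, sSaveAs, 0, 0) = 0)
End Function

'Returns True if Acrobat could open the PDF; the text file is written only in that case.
Function ConvertPdf2(pdfPath As String, textPath As String) As Boolean
    Dim AcroXApp As Object, AcroXAVDoc As Object, JsObj As Object
    Dim success As Boolean

    Set AcroXApp = CreateObject("AcroExch.App")
    Set AcroXAVDoc = CreateObject("AcroExch.AVDoc")

    success = AcroXAVDoc.Open(pdfPath, "Acrobat")
    If success Then
        Set JsObj = AcroXAVDoc.GetPDDoc.GetJSObject
        JsObj.SaveAs textPath, "com.adobe.acrobat.plain-text" 'assumed conversion ID
        AcroXAVDoc.Close True                   'True = close without saving changes
    End If

    AcroXApp.Exit
    ConvertPdf2 = success
End Function
```

Downloading first keeps the Acrobat step simple, because it only ever opens a local file, and it lets the two failure modes (no PDF returned versus PDF not openable) be flagged separately in column C.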