I did something very similar using WinPython x64 2.7.6.3 and Acrobat X Pro and used the JSObject interface to convert PDFs to DOCX.

The following should be a complete piece of code that converts a set of PDFs to DOCX:

    # gets all files under ROOT_INPUT_PATH with FILE_EXTENSION and tries to extract text from them into ROOT_OUTPUT_PATH with same filename as the input file but with INPUT_FILE_EXTENSION replaced by OUTPUT_FILE_EXTENSION
    from win32com.client import Dispatch
    from win32com.client.dynamic import ERRORS_BAD_CONTEXT

    # try importing scandir and if found, use it as it's orders of magnitude faster than stock os.walk
    try:
        from scandir import walk
    except ImportError:
        from os import walk

    import os
    import sys

    def acrobat_extract_text(f_path, f_path_out, f_basename, f_ext):
        avDoc = Dispatch("AcroExch.AVDoc")  # Connect to Adobe Acrobat

        # Open the input file; the second argument is the window title
        ret = avDoc.Open(f_path, f_path)
        assert(ret)  # FIXME: Documentation says "-1 if the file was opened successfully, 0 otherwise", but this is a bool in practice?

        pdDoc = avDoc.GetPDDoc()

        dst = os.path.join(f_path_out, ''.join((f_basename, f_ext)))

        # Adobe documentation says "For that reason, you must rely on the documentation to know what functionality is available through the JSObject interface. For details, see the JavaScript for Acrobat API Reference"
        jsObject = pdDoc.GetJSObject()

        # Here you can save as many other types by using, for instance: ""
        # NOTE: If you want to save the file as a .doc, use ""
        jsObject.SaveAs(dst, "")

        avDoc.Close(True)  # We want this to close Acrobat, as otherwise Acrobat is going to refuse processing any further files after a certain threshold of open files is reached (for example 50 PDFs)

    assert(5 == len(sys.argv)), sys.argv  # script name plus four command-line arguments
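The header comment in the script describes walking the input tree and mapping each match to an output file with a different extension, but the part that does this is not shown. A minimal sketch of that step might look like the following. The function name `iter_conversion_jobs` and its parameters are hypothetical, and mirroring the input subdirectories under the output root is an assumption rather than something the original states. Each yielded tuple is ordered to match the arguments of `acrobat_extract_text`.

```python
import fnmatch
import os

# Optional speed-up: use the third-party scandir package when it is installed.
try:
    from scandir import walk
except ImportError:
    from os import walk


def iter_conversion_jobs(root_in, pattern, root_out, out_ext):
    """Yield (f_path, f_path_out, f_basename, f_ext) for each matching file.

    Hypothetical helper: the name and the directory-mirroring behaviour
    are assumptions, not part of the original script.
    """
    for dirpath, _dirnames, filenames in walk(root_in):
        # Assumption: recreate the input's relative directory under root_out.
        rel_dir = os.path.relpath(dirpath, root_in)
        out_dir = os.path.normpath(os.path.join(root_out, rel_dir))
        for name in sorted(fnmatch.filter(filenames, pattern)):
            basename, _old_ext = os.path.splitext(name)
            yield os.path.join(dirpath, name), out_dir, basename, out_ext
```

Each job could then be passed on with `acrobat_extract_text(*job)`, after making sure the output directory exists (for example with `os.makedirs`).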