Internal Extractors

File Systems



A great deal, if not the majority, of company information is still stored in files on PC hard drives and network drives. This is fine as long as one knows where the data is, but as we all know, it can be difficult to locate files even on our own systems. As hard drive capacity grows rapidly, prices continue to sink, and data continues to flood in, the volume of information is outstripping any manual structuring or ordering that can be done. To combat this trend, the File System Extractor was developed.

  • The File System Extractor can crawl file systems quickly and easily.
  • It is easy to set up and administer through a unified graphical user interface, even across multiple paths, drives, and systems.
  • It handles the most commonly used file types, including Microsoft Office, OpenOffice, PDF, and many more.
  • It can then deliver the files in various text formats (such as XML) to fit the needs of CMSs, search engines, or other content platforms.

Additionally, the File System Extractor takes care of the details of handling the data, so that manual intervention is no longer needed. It polls the file systems where the data resides to pick up changes, deletions, and additions of files. It extracts a wide range of valuable meta-data from the files' properties. The File System Extractor does all of this to bring your organization's intellectual property back to your fingertips.
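Polling for changes, deletions, and additions can be pictured as a comparison of two directory snapshots taken at different times. The following is a minimal sketch in Python, not the File System Extractor's actual implementation: the function names snapshot and diff_snapshots are hypothetical, and the sketch assumes that a difference in modification time or size is enough to treat a file as changed.

```python
import os


def snapshot(root):
    """Map every file path under root to its (modification time, size)."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file disappeared between listing and stat
            state[path] = (st.st_mtime_ns, st.st_size)
    return state


def diff_snapshots(old, new):
    """Return sorted lists of added, changed, and deleted paths."""
    added = sorted(p for p in new if p not in old)
    deleted = sorted(p for p in old if p not in new)
    changed = sorted(p for p in new if p in old and new[p] != old[p])
    return added, changed, deleted
```

A polling loop would call snapshot at a fixed interval, pass the previous and current results to diff_snapshots, and reprocess only the added and changed files while removing the deleted ones from the index.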


File Types and Meta-Data Extraction

Often the data around a file is also valuable. This can be the date and time it was saved or modified, the file name, the name of the directory it is saved in, or other specific file properties like author, subject, title, and keywords. This information is most often referred to as meta-data. The File System Extractor can extract this data and store it in fields associated with the main text of a document.
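As an illustration of storing meta-data in fields next to the main text, the sketch below collects a few file-system level properties and writes them into a small XML record. It is an assumption-laden example, not the product's output format: the function names, the field names, and the document/field/body element names are all hypothetical. Properties such as author, subject, and title live inside the file itself and would need a format-specific parser, which is not shown here.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from pathlib import Path


def file_metadata(path):
    """Collect file-system level meta-data for one file (hypothetical field names)."""
    p = Path(path)
    st = p.stat()
    modified = datetime.fromtimestamp(st.st_mtime, tz=timezone.utc)
    return {
        "file_name": p.name,
        "directory": str(p.parent),
        "size_bytes": str(st.st_size),
        "modified": modified.isoformat(),
    }


def to_xml_record(metadata, body_text):
    """Build a <document> with one <field> per meta-data entry plus a <body>."""
    doc = ET.Element("document")
    for name, value in metadata.items():
        field = ET.SubElement(doc, "field", name=name)
        field.text = value
    ET.SubElement(doc, "body").text = body_text
    return ET.tostring(doc, encoding="unicode")
```

Keeping the meta-data in named fields rather than mixing it into the body text lets a search engine or CMS filter and sort on values such as the directory or the modification date.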

The File System Extractor also handles a large assortment of file types. Here is a list of some of the major ones:

  • Microsoft Word, Excel, PowerPoint
  • OpenOffice Writer, Calc, Impress
  • TXT, RTF
  • MSG
  • compressed formats like ZIP and RAR (which are handled recursively to extract the relevant data from each file within the archive)
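Recursive handling of compressed formats can be sketched as follows. This example covers ZIP only, using Python's standard zipfile module; RAR has no standard-library support and is left out. The function name iter_archive_members, the "!/" separator for nested paths, and the max_depth limit are assumptions made for illustration, not details of the File System Extractor.

```python
import io
import zipfile


def iter_archive_members(data, prefix="", max_depth=5):
    """Yield (path, bytes) for each file in ZIP data, descending into nested ZIPs."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            payload = zf.read(info)
            path = prefix + info.filename
            # Only descend into members named *.zip, so that other ZIP-based
            # containers are passed on whole to their own parser.
            nested = (
                max_depth > 0
                and path.lower().endswith(".zip")
                and zipfile.is_zipfile(io.BytesIO(payload))
            )
            if nested:
                yield from iter_archive_members(payload, path + "!/", max_depth - 1)
            else:
                yield path, payload
```

Each yielded pair can then be handed to the same type-specific extraction used for ordinary files. The depth limit is a simple guard against archives that are nested very deeply.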

New file types are continually being added to the import process to ensure that all kinds of files can be processed.


Supported File Systems

The File System Extractor uses the SMB (Server Message Block) protocol to connect to the various file systems. This makes it virtually operating-system independent, since Windows, Unix, Linux, Mac, and almost all other operating systems support SMB connections.
