dedoc.converters

class dedoc.converters.AbstractConverter(*, config: dict | None = None, converted_extensions: Set[str] | None = None, converted_mimes: Set[str] | None = None)

This class provides the common methods for all converters: can_convert() and convert().

__init__(*, config: dict | None = None, converted_extensions: Set[str] | None = None, converted_mimes: Set[str] | None = None) -> None
Parameters:
  • config – configuration of the converter, e.g. logger for logging

  • converted_extensions – set of supported file extensions, each with a leading dot, for example {.doc, .pdf}

  • converted_mimes – set of supported MIME types of files

can_convert(file_path: str | None = None, extension: str | None = None, mime: str | None = None, parameters: dict | None = None) -> bool

Check whether this converter can convert the file. Provide at least one of the following parameters: file_path, extension, mime.

Parameters:
  • file_path – path of the file to convert

  • extension – file extension, for example .doc or .pdf

  • mime – MIME type of file

  • parameters – any additional parameters for the given document

Returns:

True if this converter can convert the file, False otherwise
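
The check described above can be sketched as follows. This is a minimal illustration, not dedoc's implementation: `SketchConverter` is a hypothetical stand-in class, and both the fallback from `file_path` to its suffix and the "extension or MIME matches" rule are assumptions.

```python
from pathlib import Path
from typing import Optional, Set


class SketchConverter:
    """Hypothetical stand-in that mirrors the documented arguments; not dedoc's class."""

    def __init__(self, *, converted_extensions: Optional[Set[str]] = None,
                 converted_mimes: Optional[Set[str]] = None) -> None:
        self.converted_extensions = converted_extensions or set()
        self.converted_mimes = converted_mimes or set()

    def can_convert(self, file_path: Optional[str] = None, extension: Optional[str] = None,
                    mime: Optional[str] = None, parameters: Optional[dict] = None) -> bool:
        # Assumption: when no extension is passed, derive it from the file path.
        if extension is None and file_path is not None:
            extension = Path(file_path).suffix
        extension_ok = extension is not None and extension.lower() in self.converted_extensions
        mime_ok = mime is not None and mime in self.converted_mimes
        # Assumption: a match on either the extension or the MIME type is enough.
        return extension_ok or mime_ok


converter = SketchConverter(converted_extensions={".doc", ".rtf"},
                            converted_mimes={"application/msword"})
```

With no arguments at all the sketch returns False, which is why at least one of file_path, extension, or mime has to be supplied.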

abstract convert(file_path: str, parameters: dict | None = None) -> str

Convert the given file to another format if possible. Call this method only on appropriate files: ensure that can_convert() returns True for the given file first. If the file format is unsupported, a ConversionException is raised.

Parameters:
  • file_path – path of the file to convert

  • parameters – conversion parameters; see the Parameters description for more details

Returns:

path of the converted file, if the conversion was executed
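
To illustrate the contract of convert(), here is a hedged sketch of a custom converter. Everything in it is a stand-in: `ConverterSketch` only imitates the documented method names, `ConversionException` is defined locally because its import path is not shown here, and `TsvToCsvSketch` is a hypothetical converter invented for the example.

```python
import csv
import os
import tempfile
from abc import ABC, abstractmethod
from typing import Optional


class ConversionException(Exception):
    """Local stand-in for the exception named above."""


class ConverterSketch(ABC):
    """Stand-in base class that mirrors only the documented method names."""

    def __init__(self, converted_extensions):
        self.converted_extensions = set(converted_extensions)

    def can_convert(self, file_path=None, extension=None, mime=None, parameters=None) -> bool:
        if extension is None and file_path is not None:
            extension = os.path.splitext(file_path)[1]
        return extension is not None and extension.lower() in self.converted_extensions

    @abstractmethod
    def convert(self, file_path: str, parameters: Optional[dict] = None) -> str:
        """Return the path of the converted file."""


class TsvToCsvSketch(ConverterSketch):
    """Hypothetical converter: rewrites a tab-separated file as comma-separated."""

    def __init__(self):
        super().__init__({".tsv"})

    def convert(self, file_path, parameters=None):
        # Refuse unsupported files instead of producing a broken result.
        if not self.can_convert(file_path=file_path):
            raise ConversionException(f"Unsupported file format: {file_path}")
        converted_path = os.path.splitext(file_path)[0] + ".csv"
        with open(file_path, newline="", encoding="utf-8") as src, \
                open(converted_path, "w", newline="", encoding="utf-8") as dst:
            csv.writer(dst).writerows(csv.reader(src, delimiter="\t"))
        return converted_path


with tempfile.TemporaryDirectory() as tmp_dir:
    source_path = os.path.join(tmp_dir, "table.tsv")
    with open(source_path, "w", encoding="utf-8") as f:
        f.write("a\tb\n1\t2\n")
    converter = TsvToCsvSketch()
    result_path = converter.convert(source_path)
    with open(result_path, encoding="utf-8") as f:
        result_lines = f.read().splitlines()
```

The converted file is written next to the original, and its path is the return value, matching the "Returns" description above.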

class dedoc.converters.ConverterComposition(converters: List[AbstractConverter])

This class converts a document into one of a predefined list of formats, according to the available converters. The list of converters is set via the class constructor. The first suitable converter is used (the first one whose can_convert() method returns True), so the order of the converters matters.

__init__(converters: List[AbstractConverter]) -> None
Parameters:

converters – the list of converters that provide the can_convert() and convert() methods; they are used to convert files into the specified formats

convert(file_path: str, parameters: dict | None = None, extension: str | None = None, mime: str | None = None) -> str

Convert the file if there is a converter that can do it. If no converter is able to convert the file, the file is left unchanged.

Parameters:
  • file_path – path of the file to convert

  • parameters – conversion parameters; see the Parameters description for more details

  • extension – file extension, for example .doc or .pdf

  • mime – MIME type of file

Returns:

path of the converted file if a conversion was executed, otherwise the path of the original file
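
The first-match behaviour and the fallback to the original path can be sketched like this. Both classes are hypothetical stand-ins written for illustration: `FakeConverter` only rewrites the extension in the path string and touches no files, and `CompositionSketch` is an assumption about the control flow, not dedoc's code.

```python
import os
from typing import List, Optional


class FakeConverter:
    """Hypothetical stand-in: 'converts' by swapping the extension in the path string only."""

    def __init__(self, source_extension: str, target_extension: str) -> None:
        self.source_extension = source_extension
        self.target_extension = target_extension

    def can_convert(self, file_path=None, extension=None, mime=None, parameters=None) -> bool:
        if extension is None and file_path is not None:
            extension = os.path.splitext(file_path)[1]
        return extension == self.source_extension

    def convert(self, file_path: str, parameters: Optional[dict] = None) -> str:
        return os.path.splitext(file_path)[0] + self.target_extension


class CompositionSketch:
    """Tries the converters in order and uses the first one that accepts the file."""

    def __init__(self, converters: List[FakeConverter]) -> None:
        self.converters = converters

    def convert(self, file_path: str, parameters: Optional[dict] = None,
                extension: Optional[str] = None, mime: Optional[str] = None) -> str:
        for converter in self.converters:
            if converter.can_convert(file_path=file_path, extension=extension,
                                     mime=mime, parameters=parameters):
                return converter.convert(file_path, parameters=parameters)
        # No converter accepted the file: return the original path unchanged.
        return file_path


composition = CompositionSketch([
    FakeConverter(".doc", ".docx"),
    FakeConverter(".doc", ".pdf"),   # never reached for .doc files: the first match wins
    FakeConverter(".xls", ".xlsx"),
])
```

Because two of the stand-ins accept .doc, swapping their order would change the result, which is the practical meaning of "the order of converters is important".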

class dedoc.converters.BinaryConverter(*, config: dict | None = None)

Bases: AbstractConverter

Converts image-like documents with mime=application/octet-stream into PNG. See the AbstractConverter documentation for information about the methods’ parameters.

can_convert(file_path: str | None = None, extension: str | None = None, mime: str | None = None, parameters: dict | None = None) -> bool

Check whether the document is image-like (e.g. it has a .bmp, .jpg, or .tiff extension) and has mime=application/octet-stream.

convert(file_path: str, parameters: dict | None = None) -> str

Convert image-like documents with the application/octet-stream MIME type into files with the .png extension.
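
The two-part condition used by this converter (image-like extension and application/octet-stream MIME type) can be written as a small predicate. This is only a sketch: `is_binary_image` is a hypothetical helper, and the extension set contains just the three examples named above, since the full list is not given here.

```python
import os
from typing import Optional

# Assumption: only the extensions named in the text; the real list is longer ("etc.").
IMAGE_LIKE_EXTENSIONS = {".bmp", ".jpg", ".tiff"}


def is_binary_image(file_path: Optional[str] = None, extension: Optional[str] = None,
                    mime: Optional[str] = None) -> bool:
    """Return True only when both the extension and the MIME type conditions hold."""
    if extension is None and file_path is not None:
        extension = os.path.splitext(file_path)[1]
    has_image_extension = extension is not None and extension.lower() in IMAGE_LIKE_EXTENSIONS
    return has_image_extension and mime == "application/octet-stream"
```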

class dedoc.converters.DocxConverter(*, config: dict | None = None)

Bases: AbstractConverter

Converts docx-like documents (.doc, .rtf, .odt) into DOCX using the soffice application. See the AbstractConverter documentation for information about the methods’ parameters.

convert(file_path: str, parameters: dict | None = None) -> str

Convert docx-like documents into files with the .docx extension using the soffice application.
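
For orientation, a typical headless soffice invocation looks like the command built below. The exact arguments dedoc passes are not documented here, so treat this as an assumption; `build_soffice_command` and `expected_output_path` are hypothetical helper names. The block only builds the command and does not run it.

```python
import os
from typing import List


def build_soffice_command(file_path: str, target_format: str = "docx") -> List[str]:
    """Build a headless LibreOffice command that writes the result next to the input file."""
    output_dir = os.path.dirname(os.path.abspath(file_path))
    return ["soffice", "--headless", "--convert-to", target_format,
            "--outdir", output_dir, file_path]


def expected_output_path(file_path: str, target_format: str = "docx") -> str:
    """soffice keeps the base name and replaces the extension with the target format."""
    return os.path.splitext(file_path)[0] + "." + target_format


command = build_soffice_command("/tmp/report.doc")
```

On a machine with LibreOffice installed, the list can be passed to `subprocess.run(command, check=True)`. The same shape with `target_format="xlsx"` or `"pptx"` would cover the spreadsheet and presentation cases.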

class dedoc.converters.ExcelConverter(*, config: dict | None = None)

Bases: AbstractConverter

Converts xlsx-like documents (.xls, .ods) into XLSX using the soffice application. See the AbstractConverter documentation for information about the methods’ parameters.

convert(file_path: str, parameters: dict | None = None) -> str

Convert xlsx-like documents into files with the .xlsx extension using the soffice application.

class dedoc.converters.PptxConverter(*, config: dict | None = None)

Bases: AbstractConverter

Converts pptx-like documents (.ppt, .odp) into PPTX using the soffice application. See the AbstractConverter documentation for information about the methods’ parameters.

convert(file_path: str, parameters: dict | None = None) -> str

Convert pptx-like documents into files with the .pptx extension using the soffice application.

class dedoc.converters.PDFConverter(*, config: dict | None = None)

Bases: AbstractConverter

Converts pdf-like documents (.djvu) into PDF using the ddjvu application. See the AbstractConverter documentation for information about the methods’ parameters.

convert(file_path: str, parameters: dict | None = None) -> str

Convert pdf-like documents into files with the .pdf extension using the ddjvu application.
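
A plain ddjvu call that produces a PDF takes the form `ddjvu -format=pdf input.djvu output.pdf`. The sketch below builds that command without running it; `build_ddjvu_command` is a hypothetical helper, and whether dedoc passes additional options is not stated here.

```python
import os
from typing import List


def build_ddjvu_command(file_path: str) -> List[str]:
    """Build a ddjvu command that writes a PDF next to the input DjVu file."""
    output_path = os.path.splitext(file_path)[0] + ".pdf"
    return ["ddjvu", "-format=pdf", file_path, output_path]


command = build_ddjvu_command("book.djvu")
```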

class dedoc.converters.PNGConverter(*, config: dict | None = None)

Bases: AbstractConverter

Converts image-like documents (.bmp, .jpg, .tiff, etc.) into PNG. See the AbstractConverter documentation for information about the methods’ parameters.

convert(file_path: str, parameters: dict | None = None) -> str

Convert image-like documents into files with the .png extension.

class dedoc.converters.TxtConverter(*, config: dict | None = None)

Bases: AbstractConverter

Converts txt-like documents (.xml) into TXT by simple renaming. See the AbstractConverter documentation for information about the methods’ parameters.

convert(file_path: str, parameters: dict | None = None) -> str

Convert txt-like documents into files with the .txt extension by renaming them.
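
Conversion by renaming amounts to changing the file's suffix on disk. A minimal sketch, assuming the file is renamed in place and keeps its directory and base name; `rename_to_txt` is a hypothetical helper, not part of dedoc.

```python
import tempfile
from pathlib import Path


def rename_to_txt(file_path: str) -> str:
    """Rename the file so that it ends with .txt and return the new path."""
    source = Path(file_path)
    target = source.with_suffix(".txt")
    source.rename(target)  # the content is untouched, only the name changes
    return str(target)


with tempfile.TemporaryDirectory() as tmp_dir:
    xml_path = Path(tmp_dir) / "data.xml"
    xml_path.write_text("<root/>", encoding="utf-8")
    txt_path = rename_to_txt(str(xml_path))
    original_exists = xml_path.exists()
    renamed_text = Path(txt_path).read_text(encoding="utf-8")
```

After the call the original .xml path no longer exists, so callers should continue with the returned path.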