DePlot

DePlot is a multimodal model that performs single-shot translation of an input image of a plot or chart into a table. The output table is linearized as a text sequence in markdown format, generated left to right, with | separating cells and \n, <0x0A>, or [SEP] separating rows. To use DePlot for chart question answering (ChartQA), pass the output table in a prompt to an LLM of your choice (such as Llama) and have it interpret the values in the table.

Figure 1. ChartQA with DePlot overview
Figure 2. Chart QA Example (arXiv:2212.10505 [cs.CL])
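
As a rough illustration of the ChartQA flow described above, the following sketch combines a DePlot table with a question to form a prompt for a downstream LLM. The function name build_chartqa_prompt and the prompt wording are assumptions made for this example; they are not part of DePlot or SambaStudio, and the best wording depends on the LLM you choose.

```python
def build_chartqa_prompt(table_text: str, question: str) -> str:
    """Combine a DePlot table and a user question into one LLM prompt.

    Hypothetical helper: the prompt wording is illustrative only.
    """
    # Put each table row on its own line so the LLM sees a readable table.
    table = table_text.replace("<0x0A>", "\n")
    rows = [line.strip() for line in table.splitlines() if line.strip()]
    return (
        "Read the table below, which was extracted from a chart, "
        "and answer the question.\n\n"
        + "\n".join(rows)
        + f"\n\nQuestion: {question}\nAnswer:"
    )


prompt = build_chartqa_prompt(
    "Year | Sales <0x0A> 2020 | 10 <0x0A> 2021 | 15",
    "In which year were sales highest?",
)
print(prompt)
```

The resulting string can be sent to any text-generation endpoint; the LLM call itself is left out because it depends on the model you deploy.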

Input

An image in a standard format (for example, JPEG or PNG), passed to the endpoint as a file.

Output

The output will be text in a markdown tabular format.

A sample raw output may look like the following:
Column 1 | Column 2 | Column 3 <0x0A> MEDIAN | value | value <0x0A> value | value | value <0x0A> value | value | value <0x0A> value | value | value ...
Replacing each <0x0A> with a newline (\n) makes the output easier to read:
Column 1 | Column 2 | Column 3
MEDIAN | value | value
value | value | value
value | value  | value
...
An output with a title can take the following format:
TITLE | Title_of_Infographic
Column 1 | Column 2 | Column 3
MEDIAN | value | value
value  | value | value
value  | value | value
The output may also contain NaNs, depending on the input and on how DePlot interprets it:
TITLE | Title_of_Infographic
Column 1 | Column 2 | Column 3
value | nan    | value
value | value  | nan
Another possibility is an output with empty cells instead of NaNs:
TITLE | Title_of_Infographic
Column 1 | Column 2 | Column 3
value |        | value
      | value  | value
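
The output formats above can be parsed with a few lines of code. The following is a minimal sketch, assuming the response text uses one of the row separators listed in this document and an optional leading TITLE row; parse_deplot_table is a hypothetical helper name, not an API provided by DePlot.

```python
import re

# Row separators named in this document: "<0x0A>", "[SEP]", or a newline.
ROW_SEPARATOR = re.compile(r"<0x0A>|\[SEP\]|\n")


def parse_deplot_table(raw: str):
    """Return (title, header, rows) parsed from a linearized DePlot table."""
    lines = [part.strip() for part in ROW_SEPARATOR.split(raw)]
    # Split each non-empty line into cells; blank cells are kept as "".
    table = [[cell.strip() for cell in line.split("|")] for line in lines if line]

    title = None
    # An optional first row of the form "TITLE | <text>" carries the chart title.
    if table and table[0][0] == "TITLE":
        title = " | ".join(table[0][1:]).strip() or None
        table = table[1:]

    header = table[0] if table else []
    rows = table[1:]
    return title, header, rows


title, header, rows = parse_deplot_table(
    "TITLE | Sales by year <0x0A> Year | Units <0x0A> 2020 | 10 <0x0A> 2021 | "
)
print(title, header, rows)
```

Empty cells are returned as empty strings, so the caller can decide whether to leave them blank or replace them.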

In some cases, interpolating the missing values improved the quality of the downstream LLM's responses in ChartQA applications, although this may only work for certain use cases. The structure of the DePlot response also matters for ChartQA: filling the blank cells in the response with explicit NaNs improved the qualitative ChartQA results in many cases.
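
Both post-processing ideas can be sketched briefly. The example below assumes the table has already been split into rows of string cells; fill_blanks and interpolate_gaps are hypothetical helpers, and the interpolation assumes that rows are evenly spaced along the chart axis, which may not hold for your data.

```python
import math


def fill_blanks(rows, width, placeholder="nan"):
    """Pad each row to `width` cells and replace empty cells with `placeholder`."""
    filled = []
    for row in rows:
        padded = list(row) + [""] * (width - len(row))
        filled.append([cell if cell.strip() else placeholder for cell in padded])
    return filled


def interpolate_gaps(values):
    """Linearly interpolate NaN entries that have a known value on both sides.

    Leading and trailing NaNs are left untouched because there is nothing
    to interpolate between.
    """
    result = list(values)
    known = [i for i, v in enumerate(result) if not math.isnan(v)]
    for left, right in zip(known, known[1:]):
        step = (result[right] - result[left]) / (right - left)
        for i in range(left + 1, right):
            result[i] = result[left] + step * (i - left)
    return result


print(fill_blanks([["a", "", "c"], ["", "b"]], width=3))
print(interpolate_gaps([1.0, float("nan"), 3.0, float("nan")]))
```

Whether explicit NaNs or interpolated values work better is use-case dependent, so it is worth comparing both against the downstream LLM's answers.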

Hyperparameters and settings

The hyperparameters and settings for DePlot are described below.

Parameter: sequence length
Definition: The maximum number of tokens in a sequence processed by the model. Here, the sequence length is the sum of the number of tokens input to the model and the number of tokens generated by the model (max_new_tokens).
Allowed values: 512
User adjustable: No

Parameter: max_new_tokens
Definition: The maximum number of tokens for the model to generate in its output. Use this parameter to limit the response to a certain number of tokens. Generation stops when the model emits the end-of-sequence token </s> or reaches this limit, whichever comes first. The default value for max_new_tokens is 512.
Allowed values: 1 <= integer <= 512
User adjustable: Yes, via a POST request to the endpoint

Make sure that the number of tokens in your prompt plus the value you set for max_new_tokens does not exceed the sequence length of the model. For instance, if your input is 256 tokens, max_new_tokens cannot be set to more than 256, because the total would then exceed 512 (the sequence length of the model). NOTE: An error may not be raised if the total sequence length is exceeded; however, the model’s output will likely be truncated at (sequence_length - input_length) tokens.
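
The constraint above is simple arithmetic, and a small check can catch an oversized request before it is sent. This is a minimal sketch that assumes you already know the number of input tokens; max_new_tokens_budget is a hypothetical helper, not part of the endpoint API.

```python
SEQUENCE_LENGTH = 512  # fixed sequence length of the model, as listed in the settings above


def max_new_tokens_budget(input_tokens: int, requested: int = 512) -> int:
    """Clamp a requested max_new_tokens so input plus output fits the sequence length."""
    if input_tokens < 0:
        raise ValueError("input_tokens must be non-negative")
    available = SEQUENCE_LENGTH - input_tokens
    if available < 1:
        raise ValueError("the input already fills the sequence length")
    # The allowed range for max_new_tokens is 1 to 512 inclusive.
    return max(1, min(requested, available))


print(max_new_tokens_budget(256))       # 256 tokens remain for generation
print(max_new_tokens_budget(500, 100))  # only 12 tokens remain
```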

Deploy a DePlot endpoint

Follow the steps below to deploy an endpoint using DePlot.

  1. Create a new project or use an existing one.

  2. From a project window, click New endpoint. The Add an endpoint box will open.

  3. Select the following settings to create the endpoint:

    1. Select DePlot from the ML App drop-down.

    2. Select DePlot from the Select model drop-down.

  4. Click Add an endpoint to deploy the endpoint.

    Figure 3. Add an endpoint box
  5. The Endpoint created confirmation will display.

  6. Click View endpoint to open the endpoint details window.

    Figure 4. Endpoint confirmation
  7. The status will change to Live in a few minutes when the endpoint is ready.


Please refer to the Usage section for instructions on how to interact with the endpoint.

Usage

Once the endpoint has been created in SambaStudio, use the following request formats to interact with it.

Request template to pass an image input via a file:
curl -k -X POST '<your endpoint url>' \
-H 'key: <your endpoint key>' \
--form 'predict_file=@<your image file path>'

The @ prefix tells curl to upload the contents of the file rather than send the path as a literal string.

To make a POST request that passes parameters along with the image file, construct a multipart/form-data request. Use the following template, keeping the params JSON on a single line: inside single quotes, the shell passes a backslash and newline through literally, which would corrupt the JSON.
curl -X POST \
     -H 'Content-Type: multipart/form-data' \
     -H 'key: <your endpoint key>' \
     '<your endpoint url>' \
     -F 'predict_file=@<your image file path>' \
     -F 'params={"max_tokens_to_generate": {"type": "int", "value": "100"}};type=application/json'

At this time, only the multipart/form-data format is supported. Passing the image as a base64-encoded string in a JSON payload is not supported in this release.
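
The same multipart/form-data request can be built from a script. The sketch below uses only the Python standard library and mirrors the field names from the curl templates (predict_file, params, and the key header); build_multipart and query_deplot are hypothetical helper names, and the response is returned as raw text because its exact shape is not described here. Unlike the curl -k example, this code verifies TLS certificates.

```python
import json
import mimetypes
import os
import urllib.request
import uuid


def build_multipart(image_name, image_bytes, max_tokens=100):
    """Return (body, content_type) for a predict_file + params form upload."""
    boundary = uuid.uuid4().hex
    params = {"max_tokens_to_generate": {"type": "int", "value": str(max_tokens)}}
    image_type = mimetypes.guess_type(image_name)[0] or "application/octet-stream"
    parts = [
        (
            f'Content-Disposition: form-data; name="predict_file"; filename="{image_name}"',
            image_type,
            image_bytes,
        ),
        (
            'Content-Disposition: form-data; name="params"',
            "application/json",
            json.dumps(params).encode("utf-8"),
        ),
    ]
    body = b""
    for disposition, content_type, payload in parts:
        body += f"--{boundary}\r\n{disposition}\r\n".encode("utf-8")
        body += f"Content-Type: {content_type}\r\n\r\n".encode("utf-8")
        body += payload + b"\r\n"
    body += f"--{boundary}--\r\n".encode("utf-8")
    return body, f"multipart/form-data; boundary={boundary}"


def query_deplot(endpoint_url, endpoint_key, image_path, max_tokens=100):
    """POST an image to the endpoint and return the response body as text."""
    with open(image_path, "rb") as image_file:
        body, content_type = build_multipart(
            os.path.basename(image_path), image_file.read(), max_tokens
        )
    request = urllib.request.Request(
        endpoint_url,
        data=body,
        headers={"key": endpoint_key, "Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

A call such as query_deplot('<your endpoint url>', '<your endpoint key>', 'chart.png') would then return the linearized table text for further parsing.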