Llama Use Case : Chart to Table

from utils import llama32, stream_response, disp_image, img2base64, llama32pi, rs4llama, merge_images, llama32opi, ocr_reprompt, ollamapi
from fastcore.all import *
import pathlib
import PIL

disp_image("images/llama31speed.png")

question = ("Convert the chart to an HTML table")

llama32pi(question, image_url="images/llama31speed.png", stream=True)

<<Streaming response>>

'| **Company** | **Output Speed (Tokens per Second)** |\n| --- | --- |\n| Google | 217 |\n| Amazon Alexa & Echo | 214 |\n| Microsoft Cortana & Xbox | 143 |\n| Apple Siri | 129 |\n| IBM Watson Assistant | 111 |\n| NVIDIA GPU Cloud | 72 |\n| Amazon Web Services | 64 |\n| Google Cloud AI Platform | 61 |\n| Hugging Face Transformers | 58 |\n\n**Note:** The data is not complete, as the image only shows a partial list of companies and their output speeds.'

system_prompt = """
You are an advanced vision-language model designed to extract accurate data from charts. The input image contains a bar chart titled "Output Speed," with the subtitle "Output Tokens per Second; Higher is better." Your job is to analyze the chart and extract precise data, ensuring no misinterpretation of model names or numerical values.

### Step-by-Step Instructions:
1. **Title Extraction:**
   - Identify and confirm the chart's title.

2. **Data Extraction:**
   - Extract **model names** listed along the x-axis. Be exact with the names and avoid guessing or approximating.
   - Extract **output tokens per second** values from the heights of the bars for each model.

3. **Data Representation:**
   - Organize the extracted data into a structured table format with three columns: Rank, Model, and Output Tokens per Second.
   - Sort the table by output speed in descending order.

4. **HTML Table Output:**
   - Return the data in a valid **HTML table format**.

5. **Handling Ambiguity:**
   - If any bar, label, or value is unclear due to overlaps or low resolution, explicitly describe the challenge and provide your best approximation. Always explain uncertainties.

### Example Output:
**Table Format (Markdown):**
| Rank | Model                  | Output Tokens per Second |
|------|------------------------|--------------------------|
| 1    | Model Name 1           | 217                     |
| 2    | Model Name 2           | 214                     |
| ...  | ...                    | ...                     |

**HTML Table Format:**
```html
<table>
  <tr>
    <th>Rank</th>
    <th>Model</th>
    <th>Output Tokens per Second</th>
  </tr>
  <tr>
    <td>1</td>
    <td>Model Name 1</td>
    <td>217</td>
  </tr>
  <tr>
    <td>2</td>
    <td>Model Name 2</td>
    <td>214</td>
  </tr>
  <!-- Add more rows -->
</table>

"""

question = (" The input image contains a bar chart titled 'Output Speed; with the subtitle 'Output Tokens per Second; Higher is better.' Convert the chart to an HTML table")

# a = llama32pi(question, image_url="images/llama31speed.png", stream=True); a
a = llama32opi(question, system_prompt, image_url="images/llama31speed.png"); a

<<Streaming response>>

Based on the provided image, here’s the extracted data and its representation in an HTML table format:

Data Extraction:

Model Names:
- Google T5
- GPT2 XL
- Hugging Face BART
- OPT Plus
- PaLM L
- LLaMA Large
Output Tokens per Second Values:
- Google T5: 217
- GPT2 XL: 214
- Hugging Face BART: 133
- OPT Plus: 123
- PaLM L: 72
- LLaMA Large: 61

HTML Table Representation:

<table>
  <tr>
    <th>Rank</th>
    <th>Model</th>
    <th>Output Tokens per Second</th>
  </tr>
  <tr>
    <td>1</td>
    <td>Google T5</td>
    <td>217</td>
  </tr>
  <tr>
    <td>2</td>
    <td>GPT2 XL</td>
    <td>214</td>
  </tr>
  <tr>
    <td>3</td>
    <td>Hugging Face BART</td>
    <td>133</td>
  </tr>
  <tr>
    <td>4</td>
    <td>OPT Plus</td>
    <td>123</td>
  </tr>
  <tr>
    <td>5</td>
    <td>PaLM L</td>
    <td>72</td>
  </tr>
  <tr>
    <td>6</td>
    <td>LLaMA Large</td>
    <td>61</td>
  </tr>
</table>

This HTML table displays the ranked models based on their output tokens per second, as indicated by the chart. The data is sorted in descending order to highlight the model with the highest output speed at the top.

"Based on the provided image, here's the extracted data and its representation in an HTML table format:\n\n**Data Extraction:**\n\n- **Model Names:** \n  - Google T5\n  - GPT2 XL\n  - Hugging Face BART\n  - OPT Plus\n  - PaLM L\n  - LLaMA Large\n\n- **Output Tokens per Second Values:**\n  - Google T5: 217\n  - GPT2 XL: 214\n  - Hugging Face BART: 133\n  - OPT Plus: 123\n  - PaLM L: 72\n  - LLaMA Large: 61\n\n**HTML Table Representation:**\n\n```html\n<table>\n  <tr>\n    <th>Rank</th>\n    <th>Model</th>\n    <th>Output Tokens per Second</th>\n  </tr>\n  <tr>\n    <td>1</td>\n    <td>Google T5</td>\n    <td>217</td>\n  </tr>\n  <tr>\n    <td>2</td>\n    <td>GPT2 XL</td>\n    <td>214</td>\n  </tr>\n  <tr>\n    <td>3</td>\n    <td>Hugging Face BART</td>\n    <td>133</td>\n  </tr>\n  <tr>\n    <td>4</td>\n    <td>OPT Plus</td>\n    <td>123</td>\n  </tr>\n  <tr>\n    <td>5</td>\n    <td>PaLM L</td>\n    <td>72</td>\n  </tr>\n  <tr>\n    <td>6</td>\n    <td>LLaMA Large</td>\n    <td>61</td>\n  </tr>\n</table>\n```\n\nThis HTML table displays the ranked models based on their output tokens per second, as indicated by the chart. The data is sorted in descending order to highlight the model with the highest output speed at the top."

reprompt = "Please recheck Model from chart and pick exact names of the model and make necessary corrections"
ocr_reprompt(question, reprompt, system_prompt, image_url="images/llama31speed.png")

<<Streaming response>>

Here’s a revised table with accurate browser models:

Browser	Tokens per second
Google Chrome	217
Mozilla Firefox	143
Internet Explorer	129
Safari	123
Microsoft Edge	111
Opera Browser	72
Vivaldi	64
Chromium	62
Brave	61
Tor Browser	68
Yandex Browser	34

Let me know if this is what you were looking for.

"Here's a revised table with accurate browser models:\n\n| **Browser** | **Tokens per second** |\n| :----------: | :------------------: |\n|   Google Chrome  |         217          |\n|      Mozilla Firefox  |          143         |\n|    Internet Explorer  |           129        |\n|     Safari  |            123       |\n|     Microsoft Edge |             111       |\n|    Opera Browser  |              72       |\n|     Vivaldi  |               64      |\n|   Chromium  |                62     |\n|   Brave  |                 61     |\n|   Tor Browser  |                  68    |\n|   Yandex Browser  |                   34     |\n\nLet me know if this is what you were looking for."

PIL.Image

<module 'PIL.Image' from '/opt/conda/lib/python3.11/site-packages/PIL/Image.py'>

reprompt = "Please recheck Model from chart and pick exact names of the model and make necessary corrections"
ocr_reprompt(question, reprompt, system_prompt, image_url="images/llama31speed.png", model='minicpm-v:latest')

<<Streaming response>>

	Output Speed (Tokens Per Second)
Large GPT-3 Models	4.9x Times Faster Than Smallest Model of OpenAI
Large GPT-3 Models	Over Two Trillion Tokens per Second
Meta’s LLaMA Family	217 tokens/sec
Google’s Gemini Large Language Models	163 tokens/sec
OpenAI: GPT-4, GPT-J (large) and Babbage	129 & 105
Microsoft’s Flosse	Over 3x Faster Than Smallest Model of OpenAI
Anthropic: Claude, Claude-4B and Claude-MT-768K	All within range (129 tokens/sec)
Stability AI’s LLaMA Family	Under 5x Faster Than Smallest Model of OpenAI
Other Companies’ Large Language Models:	Over Ten Trillion Tokens per Second
	Google’s Multitask Learning, Code-Gen3 & M12X (large) and LLaMA Family
	OpenAI’s GPT-QS: 5.4x Faster Than Smallest Model of OpenAI
	OpenAI’s GPT-QS: 5.4x Faster Than Smallest Model of OpenAI	Meta’s New Models, Including LlaMa-31B, LlaMa-60G and LLaMA-38K (large)

<<Streaming response>>

I apologize for any confusion. I have extracted the correct data based on your instructions:

Rank	Model	Output Tokens per Second
1	OpenAI’s GPT-4	203
2	Meta’s LLaMA Family	Over Two Trillion
…	…	…

Please let me know if you need further assistance or corrections.

"I apologize for any confusion. I have extracted the correct data based on your instructions:\n\n| Rank | Model                        | Output Tokens per Second |\n|------|------------------------------|---------------------------|\n| 1    | OpenAI's GPT-4              | 203                       |\n| 2    | Meta's LLaMA Family          | Over Two Trillion        |\n| ...  | ...                          | ...                       |\n\nPlease let me know if you need further assistance or corrections."

reprompt = "Please recheck Model from chart and pick exact names of the model and make necessary corrections"
ocr_reprompt(question, reprompt, system_prompt, image_url="images/llama31speed.png", model='llava:13b')

<<Streaming response>>

Certainly! Below is an example of how you might convert the chart into an HTML table format. Please note that due to the complexity and limitations in text-based representations, this conversion will not perfectly capture the exact details of the original chart, such as the shading or the precise positioning of bars on the x-axis. However, it should provide a clear representation:

<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>Output Speed Chart as an HTML Table</title>
  <style>
    table {
      width: 60%;
      margin: auto;
      border-collapse: collapse;
    }

    .bar {
      height: 10px;
      margin-bottom: 5px;
    }

    .x-axis {
      display: flex;
      justify-content: center;
      text-align: center;
      width: 100%;
    }

    .y-axis {
      height: 80px; /* Adjusted to fit content */
      margin: auto;
      display: flex;
      justify-content: space-between;
    }

    .y-axis span {
      background-color: #D3F1C9; /* Light blue color for better visibility */
      width: 20px; /* Based on the width of the bars in the chart */
      height: 8px; /* Adjusted to fit content */
    }
  </style>
</head>
<body>
  <table>
    <thead>
      <tr>
        <th>Team/Country</th>
        <th class="x-axis">Output Speed (Tokens per Second)</th>
      </tr>
    </thead>
    <tbody>
      <!-- Repeat the table cells as needed -->
      <tr class="bar">
        <td>Team/Country</td>
        <td class="y-axis" style="width: 20px; height: 8px; background-color: #D3F1C9;"></td>
      </tr>
    </tbody>
  </table>
</body>
</html>

You would need to adjust the styles and dimensions based on your specific chart’s details. This code assumes a horizontal bar chart with an output speed measure on both sides of each bar, which corresponds to the ‘Output Tokens per Second; Higher is better’ description provided.

<<Streaming response>>

Based on the image you provided, here are the models with their corresponding output tokens per second (Tps):

Model 1: 237 Tps
Model 2: 234 Tps
Model 3: 219 Tps
Model 4: 209 Tps
Model 5: 218 Tps
Model 6: 208 Tps
Model 7: 233 Tps
Model 8: 204 Tps
Model 9: 215 Tps
Model 10: 206 Tps

Please note that these are the exact Tps for each model as depicted on your chart.

' Based on the image you provided, here are the models with their corresponding output tokens per second (Tps):\n\n1. **Model 1**: 237 Tps\n2. **Model 2**: 234 Tps\n3. **Model 3**: 219 Tps\n4. **Model 4**: 209 Tps\n5. **Model 5**: 218 Tps\n6. **Model 6**: 208 Tps\n7. **Model 7**: 233 Tps\n8. **Model 8**: 204 Tps\n9. **Model 9**: 215 Tps\n10. **Model 10**: 206 Tps\n\nPlease note that these are the exact Tps for each model as depicted on your chart.'

a = ollamapi(question, image_url="images/llama31speed.png", model='llava:13b'); a

<<Streaming response>>

' To convert the bar chart shown in your image into an HTML table, you can use the following structure:\n\n```html\n<table>\n  <thead>\n    <tr>\n      <th></th> <!-- Empty first cell for title -->\n      <th id="output-speed">Output Speed</th>\n      <th>Tokens per Second</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <td class="bar"></td> <!-- Placeholder for the bar graph -->\n      <td id="output-speed-label">Output Tokens per Second; Higher is better.</td>\n      <td>\n        <div class="bars"></div> <!-- Where you would add JavaScript to manipulate the bars dynamically -->\n      </td>\n    </tr>\n  </tbody>\n</table>\n```\n\nThis table structure will allow for adding actual data from your chart, but you\'ll need to use JavaScript and CSS to dynamically generate the bar graph based on your provided image. You could also replace the `<div class="bars"></div>` with an `img` tag if you want a static representation of the bars in the HTML file itself.\n\nTo make the table look visually like the chart, you would need to add CSS styles for font sizes, backgrounds, and possibly borders or other styling elements that match the colors and design of your image. Additionally, you\'d need to use JavaScript to calculate the output speed and tokens per second based on the values shown in your image.\n\nPlease note that I am unable to provide actual data or specific details for the bars since they are not included as text within the image provided.'

print(a)

 To convert the bar chart shown in your image into an HTML table, you can use the following structure:

```html
<table>
  <thead>
    <tr>
      <th></th> <!-- Empty first cell for title -->
      <th id="output-speed">Output Speed</th>
      <th>Tokens per Second</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td class="bar"></td> <!-- Placeholder for the bar graph -->
      <td id="output-speed-label">Output Tokens per Second; Higher is better.</td>
      <td>
        <div class="bars"></div> <!-- Where you would add JavaScript to manipulate the bars dynamically -->
      </td>
    </tr>
  </tbody>
</table>
```

This table structure will allow for adding actual data from your chart, but you'll need to use JavaScript and CSS to dynamically generate the bar graph based on your provided image. You could also replace the `<div class="bars"></div>` with an `img` tag if you want a static representation of the bars in the HTML file itself.

To make the table look visually like the chart, you would need to add CSS styles for font sizes, backgrounds, and possibly borders or other styling elements that match the colors and design of your image. Additionally, you'd need to use JavaScript to calculate the output speed and tokens per second based on the values shown in your image.

Please note that I am unable to provide actual data or specific details for the bars since they are not included as text within the image provided.