LLM Demo Notebook¶

Getting Set Up¶

You can run this notebook on the Jupyter Hub machines, but you will need to set up an OpenAI account. Alternatively, if you are running on your own computer, you can try running a model locally.

Step 1. Create an OpenAI account¶

You can create a free account, which comes with some initial free credits, at:

https://platform.openai.com

You will then need to get an API key. Save that API key to a local file called openai.key:

In [ ]:
# with open("openai.key", "w") as f:
#     f.write("sk-zPutYourkeyHereYJ")
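
Reading the key back is where small mistakes tend to creep in, for example a trailing newline left in the file. The sketch below is one possible way to load it; the helper name `load_api_key` and the fallback to the `OPENAI_API_KEY` environment variable are illustrative assumptions, not part of the original notebook.

```python
import os
from pathlib import Path


def load_api_key(path="openai.key", env_var="OPENAI_API_KEY"):
    """Return the API key stored in `path`, else the one in `env_var`.

    Hypothetical helper: the name and the fallback order are assumptions.
    """
    key_file = Path(path)
    if key_file.exists():
        # strip() removes the trailing newline many editors append.
        key = key_file.read_text().strip()
        if key:
            return key
    # Fall back to the environment if the file is missing or empty.
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"No API key found in {path!r} or ${env_var}")
    return key
```

Calling `load_api_key()` with no arguments looks for openai.key in the current directory first. Whichever approach you use, keep the key file out of version control.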

Step 2. Install Python Tools¶

Uncomment and run the following line to install the required packages.

In [ ]:
# pip install -U openai langchain langchain-openai

Using OpenAI with LangChain¶

In [ ]:
from langchain_openai import OpenAI
In [ ]:
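# Read the key and strip the trailing newline; the with-block closes the file.
with open("openai.key", "r") as f:
    openai_key = f.readline().strip()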
llm = OpenAI(openai_api_key=openai_key,
             model_name="gpt-3.5-turbo-instruct")
In [ ]:
llm.invoke("What is the capital of California? Provide a short answer.")
Out[ ]:
'\n\nSacramento'
In [ ]:
for chunk in llm.stream("Write a short song about data science and large language models."):
    print(chunk, end="", flush=True)

Verse 1:
Data science, it's the way
To unlock secrets, every day
With algorithms, we can see
Patterns and insights, so easily

Chorus:
Large language models, they're the key
To understanding, what we can't see
They analyze, and they predict
With data science, we can't restrict

Verse 2:
Millions of words, they can understand
From news articles, to social media trends
They learn and evolve, with every use
Data science, it's a powerful muse

Chorus:
Large language models, they're the key
To understanding, what we can't see
They analyze, and they predict
With data science, we can't restrict

Bridge:
From Siri to Alexa, they're everywhere
Large language models, they're beyond compare
They help us communicate, with machines
Data science, fulfilling our dreams

Chorus:
Large language models, they're the key
To understanding, what we can't see
They analyze, and they predict
With data science, we can't restrict

Outro:
So let's embrace, this technology
Data science and large language models, our destiny
Together they'll unlock, the future's doors
And we'll keep evolving

Running Locally with Ollama and LangChain¶

You can download and install Ollama from:

https://ollama.ai/download

Ollama runs models locally on your own machine, so no API key is needed.

In [ ]:
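from langchain.llms import Ollama
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

# Assumes the model has already been downloaded, e.g. with `ollama pull vicuna:7b`.
# The stdout callback prints tokens as they are generated, so each response
# appears below the cell in addition to being returned.
vicuna = Ollama(
    model="vicuna:7b",
    #temperature=0,
    callback_manager=CallbackManager([StreamingStdOutCallbackHandler()])
)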
In [ ]:
vicuna.invoke("What is the capital of California? Answer with only one word.")
Sacramento
Out[ ]:
'\nSacramento'
In [ ]:
vicuna.invoke("Write a short song about data science and large language models.")
Verse 1:
In the world of data, there's a new star
A shining light that's taking us far
It's called data science, it's a magic spell
That brings insights from the depths of hell

Chorus:
Large language models, they're in our way
Helping us to analyze and find new things each day
With algorithms and machine learning too
We can uncover secrets that were once so few

Verse 2:
From natural language processing to visualize
The beauty of data is now just a glance away
With Python and R, we're on our way
To discovering insights that will light the way

Chorus:
Large language models, they're in our way
Helping us to analyze and find new things each day
With algorithms and machine learning too
We can uncover secrets that were once so few

Bridge:
And as we delve deeper into this world of data
We see the power of technology taking hold
But let's not forget the ethics and the laws
That guide us on our journey to discover new jaws

Chorus:
Large language models, they're in our way
Helping us to analyze and find new things each day
With algorithms and machine learning too
We can uncover secrets that were once so few
Out[ ]:
"Verse 1:\nIn the world of data, there's a new star\nA shining light that's taking us far\nIt's called data science, it's a magic spell\nThat brings insights from the depths of hell\n\nChorus:\nLarge language models, they're in our way\nHelping us to analyze and find new things each day\nWith algorithms and machine learning too\nWe can uncover secrets that were once so few\n\nVerse 2:\nFrom natural language processing to visualize\nThe beauty of data is now just a glance away\nWith Python and R, we're on our way\nTo discovering insights that will light the way\n\nChorus:\nLarge language models, they're in our way\nHelping us to analyze and find new things each day\nWith algorithms and machine learning too\nWe can uncover secrets that were once so few\n\nBridge:\nAnd as we delve deeper into this world of data\nWe see the power of technology taking hold\nBut let's not forget the ethics and the laws\nThat guide us on our journey to discover new jaws\n\nChorus:\nLarge language models, they're in our way\nHelping us to analyze and find new things each day\nWith algorithms and machine learning too\nWe can uncover secrets that were once so few"






Data Analytics¶

We can use LLMs to help analyze data.

In [ ]:
import pandas as pd
tweets = pd.read_json("AOC_recent_tweets.txt")
list(tweets['full_text'][0:10])
Out[ ]:
['RT @RepEscobar: Our country has the moral obligation and responsibility to reunite every single family separated at the southern border.\n\nT…',
 'RT @RoKhanna: What happens when we guarantee $15/hour?\n\n💰 31% of Black workers and 26% of Latinx workers get raises.\n😷 A majority of essent…',
 '(Source: https://t.co/3o5JEr6zpd)',
 'Joe Cunningham pledged to never take corporate PAC money, and he never did. Mace said she’ll cash every check she gets. Yet another way this is a downgrade. https://t.co/DytsQXKXgU',
 'What’s even more gross is that Mace takes corporate PAC money.\n\nShe’s already funded by corporations. Now she’s choosing to swindle working people on top of it.\n\nPeak scam artistry. Caps for cash 💰 https://t.co/CcVxgDF6id',
 'Joe Cunningham already proving to be leagues more decent + honest than Mace seems capable of.\n\nThe House was far better off w/ Cunningham. It’s sad to see Mace diminish the representation of her community by launching a reputation of craven dishonesty right off the bat.',
 'Pretty horrible.\n\nWell, it’s good to know what kind of person she is early. Also good to know that Mace is cut from the same Trump cloth of dishonesty and opportunism.\n\nSad to see a colleague intentionally hurt other women and survivors to make a buck. Thought she’d be better. https://t.co/CcVxgDF6id',
 'RT @jaketapper: .@RepNancyMace fundraising off the false smear that @AOC misrepresented her experience during the insurrection. She didn’t.…',
 'RT @RepMcGovern: One reason Washington can’t “come together” is because of people like her sending out emails like this.\n\nShe should apolog…',
 'RT @JoeNeguse: Just to be clear, “targeting” stimulus checks means denying them to some working families who would otherwise receive them.']




Suppose I wanted to evaluate whether a tweet is making a statement about the minimum wage.

In [ ]:
prompt = """
Is the following text making a statement about minimum wage? You should answer either Yes or No.

{text}

Answer:
"""
questions = [prompt.format_map(dict(text=t)) for t in tweets['full_text'].head(20)]
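
To see exactly what the model receives, it can help to build a couple of prompts and print one. The following is a minimal sketch of the same templating step; the helper `build_questions` and the two sample texts are hypothetical and are not drawn from the tweet data.

```python
DEMO_PROMPT = """
Is the following text making a statement about minimum wage? You should answer either Yes or No.

{text}

Answer:
"""


def build_questions(template, texts):
    """Fill the {text} slot of `template` once for each input text."""
    # Braces inside the substituted text are safe; only literal braces in the
    # template itself would need escaping as {{ and }}.
    return [template.format(text=t) for t in texts]


# Hypothetical inputs, used only to show the shape of the final prompt.
demo_texts = ["Raise the wage floor to $15/hour.", "Happy Friday, everyone!"]
demo_questions = build_questions(DEMO_PROMPT, demo_texts)
print(demo_questions[0])
```

Ending the prompt with "Answer:" nudges a completion-style model to continue with a single word rather than a longer explanation.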

Ask each of the LLMs to answer the questions:

In [ ]:
open_ai_answers = llm.batch(questions)
open_ai_answers
Out[ ]:
['No',
 '\nYes',
 '\nNo',
 '\nNo',
 '\nNo',
 'No',
 '\nNo',
 '\nNo',
 '\nNo',
 '\nYes',
 '\nYes',
 '\nNo',
 '\nNo',
 '\nNo',
 'No',
 '\nNo',
 '\nYes',
 '\nNo',
 '\nYes',
 'No']
In [ ]:
vicuna_answers = vicuna.batch(questions)
vicuna_answers
No.No.
Yes.No.
Yes.No.No.
Yes.
No.No.Yes.No.
No.Yes.No.Yes.No.Yes.No.No.
Out[ ]:
['\nNo.',
 'No.',
 '\nYes.',
 'No.',
 '\nYes.',
 'No.',
 'No.',
 '\nYes.',
 '\nNo.',
 'No.',
 'Yes.',
 'No.',
 '\nNo.',
 'Yes.',
 'No.',
 'Yes.',
 'No.',
 'Yes.',
 'No.',
 'No.']
In [ ]:
pd.set_option('display.max_colwidth', None)
df = pd.DataFrame({"OpenAI": open_ai_answers, 
                   "Vicuna": vicuna_answers,
                   "Text": tweets['full_text'].head(20)})
df["OpenAI"] = df["OpenAI"].str.contains("Y")
df["Vicuna"] = df["Vicuna"].str.contains("Y")
df
Out[ ]:
OpenAI Vicuna Text
0 False False RT @RepEscobar: Our country has the moral obligation and responsibility to reunite every single family separated at the southern border.\n\nT…
1 True False RT @RoKhanna: What happens when we guarantee $15/hour?\n\n💰 31% of Black workers and 26% of Latinx workers get raises.\n😷 A majority of essent…
2 False True (Source: https://t.co/3o5JEr6zpd)
3 False False Joe Cunningham pledged to never take corporate PAC money, and he never did. Mace said she’ll cash every check she gets. Yet another way this is a downgrade. https://t.co/DytsQXKXgU
4 False True What’s even more gross is that Mace takes corporate PAC money.\n\nShe’s already funded by corporations. Now she’s choosing to swindle working people on top of it.\n\nPeak scam artistry. Caps for cash 💰 https://t.co/CcVxgDF6id
5 False False Joe Cunningham already proving to be leagues more decent + honest than Mace seems capable of.\n\nThe House was far better off w/ Cunningham. It’s sad to see Mace diminish the representation of her community by launching a reputation of craven dishonesty right off the bat.
6 False False Pretty horrible.\n\nWell, it’s good to know what kind of person she is early. Also good to know that Mace is cut from the same Trump cloth of dishonesty and opportunism.\n\nSad to see a colleague intentionally hurt other women and survivors to make a buck. Thought she’d be better. https://t.co/CcVxgDF6id
7 False True RT @jaketapper: .@RepNancyMace fundraising off the false smear that @AOC misrepresented her experience during the insurrection. She didn’t.…
8 False False RT @RepMcGovern: One reason Washington can’t “come together” is because of people like her sending out emails like this.\n\nShe should apolog…
9 True False RT @JoeNeguse: Just to be clear, “targeting” stimulus checks means denying them to some working families who would otherwise receive them.
10 True True Amazon workers have the right to form a union.\n\nAnti-union tactics like these, especially from a trillion-dollar company trying to disrupt essential workers from organizing for better wages and dignified working conditions in a pandemic, are wrong. https://t.co/nTDqMUapYs
11 False False RT @WorkingFamilies: Voters elected Democrats to deliver more relief, not less.
12 False False We should preserve what was there and not peg it to outdated 2019 income. People need help!
13 False True If conservative Senate Dems institute a lower income threshold in the next round of checks, that could potentially mean the first round of checks under Trump help more people than the first round under Biden.\n\nDo we want to do that? No? Then let’s stop playing & just help people.
14 False False @iamjoshfitz 😂 call your member of Congress, they can help track it down
15 False True All Dems need for the slam dunk is to do what people elected us to do: help as many people as possible.\n\nIt’s not hard. Let’s not screw it up with austerity nonsense that squeezes the working class yet never makes a peep when tax cuts for yachts and private jets are proposed.
16 True False It should be $2000 to begin w/ anyway. Brutally means-testing a $1400 round is going to hurt so many people. THAT is the risk we can’t afford.\n\nIncome thresholds already work in reverse & lag behind reality. Conservative Dems can ask to tax $ back later if they’re so concerned.
17 False True We cannot cut off relief at $50k. It is shockingly out of touch to assert that $50k is “too wealthy” to receive relief.\n\nMillions are on the brink of eviction. Give too little and they’re devastated. Give “too much” and a single mom might save for a rainy day. This isn’t hard. https://t.co/o14r3phJeH
18 True False Imagine being a policymaker in Washington, having witnessed the massive economic, social, and health destruction over the last year, and think that the greatest policy risk we face is providing *too much* relief.\n\nSounds silly, right?\n\n$1.9T should be a floor, not a ceiling.
19 False False @AndrewYang @TweetBenMax @RitchieTorres Thanks @AndrewYang! Happy to chat about the plan details and the community effort that’s gone into this legislation. 🌃🌎
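
The table shows the two models disagreeing on several tweets, and the `.str.contains("Y")` check treats any completion containing a capital Y as a yes. A stricter parse plus a simple agreement measure could look like the sketch below; `to_bool` and `agreement_rate` are hypothetical helpers, and the choice to ignore unparseable answers is an assumption.

```python
import re


def to_bool(answer):
    """Map a raw completion to True (yes), False (no), or None if unclear."""
    # Take the first alphabetic word, ignoring leading whitespace and
    # trailing punctuation such as the period some models append.
    match = re.match(r"\s*([A-Za-z]+)", answer)
    if not match:
        return None
    first = match.group(1).lower()
    if first == "yes":
        return True
    if first == "no":
        return False
    return None


def agreement_rate(answers_a, answers_b):
    """Fraction of positions where both models gave the same usable answer."""
    pairs = [(to_bool(a), to_bool(b)) for a, b in zip(answers_a, answers_b)]
    # Assumption: positions where either answer is unparseable are skipped.
    usable = [(a, b) for a, b in pairs if a is not None and b is not None]
    if not usable:
        return 0.0
    return sum(a == b for a, b in usable) / len(usable)
```

With the notebook's variables this would be called as `agreement_rate(open_ai_answers, vicuna_answers)`. Agreement between models says nothing about correctness on its own, so a few hand-labeled tweets are still needed to judge either model.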
In [ ]:
prompt = """
Is the following text self promoting? You should answer either Yes or No.

{text}

Answer:
"""
questions = [prompt.format_map(dict(text=t)) for t in tweets['full_text'].head(20)]
open_ai_answers2 = llm.batch(questions)
df2 = pd.DataFrame({"OpenAI": open_ai_answers2, 
                   "Text": tweets['full_text'].head(20)})
# df2["OpenAI"] = df2["OpenAI"].str.contains("Y")
df2
Out[ ]:
OpenAI Text
0 \nNo RT @RepEscobar: Our country has the moral obligation and responsibility to reunite every single family separated at the southern border.\n\nT…
1 \nNo RT @RoKhanna: What happens when we guarantee $15/hour?\n\n💰 31% of Black workers and 26% of Latinx workers get raises.\n😷 A majority of essent…
2 \nNo (Source: https://t.co/3o5JEr6zpd)
3 \nYes Joe Cunningham pledged to never take corporate PAC money, and he never did. Mace said she’ll cash every check she gets. Yet another way this is a downgrade. https://t.co/DytsQXKXgU
4 \nYes What’s even more gross is that Mace takes corporate PAC money.\n\nShe’s already funded by corporations. Now she’s choosing to swindle working people on top of it.\n\nPeak scam artistry. Caps for cash 💰 https://t.co/CcVxgDF6id
5 \nYes Joe Cunningham already proving to be leagues more decent + honest than Mace seems capable of.\n\nThe House was far better off w/ Cunningham. It’s sad to see Mace diminish the representation of her community by launching a reputation of craven dishonesty right off the bat.
6 \nNo Pretty horrible.\n\nWell, it’s good to know what kind of person she is early. Also good to know that Mace is cut from the same Trump cloth of dishonesty and opportunism.\n\nSad to see a colleague intentionally hurt other women and survivors to make a buck. Thought she’d be better. https://t.co/CcVxgDF6id
7 \nNo RT @jaketapper: .@RepNancyMace fundraising off the false smear that @AOC misrepresented her experience during the insurrection. She didn’t.…
8 \nNo RT @RepMcGovern: One reason Washington can’t “come together” is because of people like her sending out emails like this.\n\nShe should apolog…
9 \nNo RT @JoeNeguse: Just to be clear, “targeting” stimulus checks means denying them to some working families who would otherwise receive them.
10 \nNo Amazon workers have the right to form a union.\n\nAnti-union tactics like these, especially from a trillion-dollar company trying to disrupt essential workers from organizing for better wages and dignified working conditions in a pandemic, are wrong. https://t.co/nTDqMUapYs
11 \nNo RT @WorkingFamilies: Voters elected Democrats to deliver more relief, not less.
12 \nNo We should preserve what was there and not peg it to outdated 2019 income. People need help!
13 \nNo If conservative Senate Dems institute a lower income threshold in the next round of checks, that could potentially mean the first round of checks under Trump help more people than the first round under Biden.\n\nDo we want to do that? No? Then let’s stop playing & just help people.
14 No @iamjoshfitz 😂 call your member of Congress, they can help track it down
15 \nNo All Dems need for the slam dunk is to do what people elected us to do: help as many people as possible.\n\nIt’s not hard. Let’s not screw it up with austerity nonsense that squeezes the working class yet never makes a peep when tax cuts for yachts and private jets are proposed.
16 \nNo It should be $2000 to begin w/ anyway. Brutally means-testing a $1400 round is going to hurt so many people. THAT is the risk we can’t afford.\n\nIncome thresholds already work in reverse & lag behind reality. Conservative Dems can ask to tax $ back later if they’re so concerned.
17 \nNo We cannot cut off relief at $50k. It is shockingly out of touch to assert that $50k is “too wealthy” to receive relief.\n\nMillions are on the brink of eviction. Give too little and they’re devastated. Give “too much” and a single mom might save for a rainy day. This isn’t hard. https://t.co/o14r3phJeH
18 \nNo Imagine being a policymaker in Washington, having witnessed the massive economic, social, and health destruction over the last year, and think that the greatest policy risk we face is providing *too much* relief.\n\nSounds silly, right?\n\n$1.9T should be a floor, not a ceiling.
19 \nYes @AndrewYang @TweetBenMax @RitchieTorres Thanks @AndrewYang! Happy to chat about the plan details and the community effort that’s gone into this legislation. 🌃🌎
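
Since both experiments follow the same pattern (fill a template, call `batch`, line the answers up with the text), the pattern can be wrapped in one function. This is a rough sketch under the assumption that the model is passed in as a plain callable; `ask_yes_no` and `fake_batch` are hypothetical names, and `fake_batch` only stands in for `llm.batch` or `vicuna.batch` so the example runs without an API key.

```python
def ask_yes_no(batch_fn, question, texts):
    """Ask the same Yes/No question about every text.

    `batch_fn` is any callable that maps a list of prompts to a list of
    completions, for example `llm.batch` from the cells above.
    """
    prompts = [
        f"\n{question} You should answer either Yes or No.\n\n{t}\n\nAnswer:\n"
        for t in texts
    ]
    answers = batch_fn(prompts)
    # Pair each text with its cleaned-up answer.
    return [(t, a.strip()) for t, a in zip(texts, answers)]


def fake_batch(prompts):
    """Stand-in model: says Yes only when the prompt mentions $15."""
    return ["\nYes" if "$15" in p else "\nNo" for p in prompts]


rows = ask_yes_no(
    fake_batch,
    "Is the following text making a statement about minimum wage?",
    ["Guarantee $15/hour.", "Happy Friday!"],
)
for text, answer in rows:
    print(answer, "|", text)
```

Passing the model in as an argument makes it easy to run the same question through several models and compare the results side by side.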