IBM Granite
https://www.ibm.com/granite/docs/use-cases/cookbooks
"IBM Granite is a family of large language models developed by IBM. These models are designed to be powerful, reliable, and customizable for enterprise use cases. Here are some key points about IBM Granite:
Model Variants: IBM Granite offers different model sizes and configurations to suit various needs, ranging from smaller models optimized for specific tasks to larger models capable of handling more complex and general-purpose language understanding and generation tasks.
Performance: The models are trained on a diverse and extensive dataset, which includes a mix of licensed data, data created by IBM research, and publicly available data. This training approach aims to ensure high performance across a wide range of natural language processing (NLP) tasks.
Customization: IBM Granite models can be fine-tuned and customized for specific industries or applications. This allows businesses to tailor the models to their unique requirements, improving accuracy and relevance for their specific use cases.
Enterprise Focus: These models are designed with enterprise needs in mind, including considerations for security, compliance, and scalability. IBM provides tools and services to help organizations deploy and manage these models in their environments.
Responsible AI: IBM emphasizes responsible AI practices in the development of Granite models. This includes efforts to mitigate biases, ensure transparency, and maintain ethical standards in AI deployment.
Integration and Deployment: IBM offers tools and platforms to facilitate the integration and deployment of Granite models into existing workflows and applications. This includes APIs, SDKs, and other developer resources to make it easier for businesses to leverage these models.
Applications: IBM Granite models can be used for a variety of applications, such as customer support, content generation, data analysis, and more. Their versatility makes them suitable for numerous business scenarios where natural language understanding and generation are required.
Overall, IBM Granite represents IBM's effort to provide high-quality, enterprise-grade language models that can be adapted and deployed to meet the specific needs of businesses across different industries." -IBM Granite
https://www.ibm.com/granite/docs/models/granite
Chat, search, and research with Granite 4. https://www.ibm.com/granite/playground
......
Inference Examples
Basic Inference
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
device = "cuda"
model_path = "ibm-granite/granite-4.0-h-tiny"
tokenizer = AutoTokenizer.from_pretrained(model_path)
# drop device_map if running on CPU
model = AutoModelForCausalLM.from_pretrained(model_path, device_map=device)
model.eval()
# change input text as desired
chat = [
    {"role": "user", "content": "What is the largest ocean on Earth?"},
]
chat = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
# tokenize the text
input_tokens = tokenizer(chat, return_tensors="pt").to(device)
# generate output tokens
output = model.generate(**input_tokens,
                        max_new_tokens=150,
                        do_sample=False)  # greedy decoding, equivalent to temperature 0
# decode output tokens into text
output = tokenizer.batch_decode(output)
print(output[0])
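Note that `batch_decode` above returns the prompt together with the completion; to print only the model's answer, slice off the prompt tokens before decoding. A minimal sketch of that slicing, with plain lists standing in for tensors (the token ids below are hypothetical):

```python
def strip_prompt(output_ids, input_len):
    """Return only the newly generated token ids."""
    return output_ids[input_len:]

# hypothetical token ids: the first 4 are the prompt, the rest were generated
full = [101, 7, 8, 9, 42, 43, 44]
new = strip_prompt(full, 4)
```

With the real tensors, the equivalent is `tokenizer.decode(output[0][input_tokens["input_ids"].shape[1]:], skip_special_tokens=True)`.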
......
Tool Calling
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
device = "cuda"
model_path = "ibm-granite/granite-4.0-micro"
tokenizer = AutoTokenizer.from_pretrained(model_path)
# drop device_map if running on CPU
model = AutoModelForCausalLM.from_pretrained(model_path, device_map=device)
model.eval()
chat = [
    {"role": "user", "content": "What's the current weather in New York?"},
    {"role": "assistant",
     "content": "",
     "tool_calls": [
         {
             "function": {
                 "name": "get_current_weather",
                 "arguments": {"location": "New York"}
             }
         }
     ]
    },
    {"role": "tool", "content": "New York is sunny with a temperature of 30°C."},
    {"role": "user", "content": "OK, now tell me what the weather is like in Bengaluru at this moment."}
]
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "description": "The city and state, e.g. San Francisco, CA",
                        "type": "string",
                    },
                },
                "required": ["location"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "get_stock_price",
            "description": "Retrieves the current stock price for a given ticker symbol. The ticker symbol must be a valid symbol for a publicly traded company on a major US stock exchange like NYSE or NASDAQ. The tool will return the latest trade price in USD. It should be used when the user asks about the current or most recent price of a specific stock. It will not provide any other information about the stock or company.",
            "parameters": {
                "type": "object",
                "properties": {
                    "ticker": {
                        "description": "The stock ticker symbol, e.g. AAPL for Apple Inc.",
                        "type": "string",
                    },
                },
                "required": ["ticker"]
            }
        }
    }
]
chat = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True, tools=tools)
# tokenize the text
input_tokens = tokenizer(chat, return_tensors="pt").to(device)
# generate output tokens
output = model.generate(**input_tokens,
                        max_new_tokens=100,
                        do_sample=False)  # greedy decoding, equivalent to temperature 0
# decode output tokens into text
output = tokenizer.batch_decode(output)
# print output
print(output[0])
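When the model decides to call a tool, the generated text contains the call as JSON, which your application must parse and execute before feeding the result back in a "tool" turn. The exact wrapper depends on the model's chat template; the sketch below assumes (hypothetically) that calls arrive wrapped in `<tool_call>...</tool_call>` tags, so adjust the pattern to whatever your tokenizer's template actually emits:

```python
import json
import re

def extract_tool_calls(text):
    """Pull JSON tool calls out of generated text.

    The <tool_call>...</tool_call> wrapper is an assumption, not a
    documented Granite output format; inspect a real generation to
    confirm the delimiters before relying on this.
    """
    return [json.loads(m)
            for m in re.findall(r"<tool_call>(.*?)</tool_call>", text, re.DOTALL)]

# hypothetical generated text for the Bengaluru follow-up question
sample = '<tool_call>{"name": "get_current_weather", "arguments": {"location": "Bengaluru"}}</tool_call>'
calls = extract_tool_calls(sample)
```

Each parsed call gives you the function name and arguments to dispatch to your own implementation; its return value then goes back into the chat as a `{"role": "tool", "content": ...}` message.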