metadata

license: apache-2.0
pipeline_tag: text-generation

InternLM-XComposer2

💻Github Repo

InternLM-XComposer2 is a vision-language large model (VLLM) based on InternLM2 for advanced text-image comprehension and composition.

We release the InternLM-XComposer2 series in two versions:

InternLM-XComposer2-VL: The pretrained VLLM, with InternLM2 as the initialization of the LLM, achieving strong performance on various multimodal benchmarks.
InternLM-XComposer2: The finetuned VLLM for Free-form Interleaved Text-Image Composition.

Import from Transformers

To load the InternLM-XComposer2-7B model using Transformers, use the following code:

import torch
from PIL import Image
from transformers import AutoTokenizer, AutoModelForCausalLM
ckpt_path = "internlm/internlm-xcomposer2-7b"
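# The tokenizer runs on the CPU and has no `.cuda()` method; only the model is moved to the GPU.
tokenizer = AutoTokenizer.from_pretrained(ckpt_path, trust_remote_code=True)
# Set `torch_dtype=torch.float16` to load the model in float16; otherwise it is loaded as float32 and may cause an out-of-memory (OOM) error.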
model = AutoModelForCausalLM.from_pretrained(ckpt_path, torch_dtype=torch.float16, trust_remote_code=True).cuda()
model = model.eval()
model.vit.resize_pos()
img_path_list = [
    './panda.jpg',
    './bamboo.jpeg',
]
images = []
for img_path in img_path_list:
    image = Image.open(img_path).convert("RGB")
    image = model.vis_processor(image)
    images.append(image)
image = torch.stack(images)
query = '<ImageHere> <ImageHere>please write an article based on the images. Title: my favorite animal.'
response, history = model.chat(tokenizer, query=query, image=image, history=[], meta_instruction='')
print(response)
# in this animal kingdom, there are many species of animals. Each animal has its own special charm and characteristics. Among them, I like pandas the most. Pandas have a big black circle on their white furry faces, so they look very cute. It's not surprising that people call them "bearcats." But do you know why they're called pandas? Because pandas only eat bamboo shoots.\n\npandas' favorite food is bamboo shoots. The color of fresh bamboo shoots is light green. There is some starch in it, which can be used to make delicious food. But because panda's stomach doesn't produce amylase, it needs to consume large amounts of bamboo shoots every day to meet its body's nutritional needs. As a result, pandas spend most of their time eating bamboo, as well as sleeping. However, pandas cannot eat any meat except bamboo shoots. When pandas are hungry, they may go into the field to look for ants or other insects to eat. In fact, when pandas really want to eat meat, they can easily get away from it.\n\nbesides their love of eating bamboo, pandas also have another interesting characteristic: they always walk backwards. This makes them look slow and lazy. Although they seem sluggish, they actually run at speeds of up to 35 km/h (21.7 mph) when they need to escape danger! So don't underestimate pandas just because they're lazy!\n\nunfortunately, due to the destruction of natural habitats by humans, there are currently less than 1,800 pandas left in the world. I hope everyone can help me save pandas and protect our environment!
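
The query above carries one `<ImageHere>` placeholder per input image (two placeholders, two stacked images). The sketch below builds such a query and checks the count before calling `model.chat`. It is only an illustration: `build_query` and `check_query` are hypothetical helpers, not part of the InternLM-XComposer2 code, and the one-placeholder-per-image rule is an assumption inferred from this example rather than a documented requirement.

```python
IMAGE_TOKEN = "<ImageHere>"  # placeholder string as it appears in the example query


def build_query(prompt: str, num_images: int) -> str:
    """Prefix `prompt` with one space-separated placeholder per image (hypothetical helper)."""
    if num_images < 0:
        raise ValueError("num_images must be non-negative")
    return " ".join([IMAGE_TOKEN] * num_images) + prompt


def check_query(query: str, num_images: int) -> None:
    """Raise if the placeholder count differs from the image count (assumed rule)."""
    found = query.count(IMAGE_TOKEN)
    if found != num_images:
        raise ValueError(
            f"query has {found} placeholder(s) but {num_images} image(s) were given"
        )


query = build_query(
    "please write an article based on the images. Title: my favorite animal.", 2
)
check_query(query, 2)
print(query)
```

With the stacked tensor from the example, `num_images` would be `image.shape[0]`, since `torch.stack` adds a leading dimension with one entry per image.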

Import from Transformers (Chinese prompt)

To load the InternLM-XComposer2-7B model and run it with a Chinese prompt, use the following code:

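import torch
from PIL import Image
from transformers import AutoTokenizer, AutoModelForCausalLM
ckpt_path = "internlm/internlm-xcomposer2-7b"
# The tokenizer runs on the CPU and has no `.cuda()` method; only the model is moved to the GPU.
tokenizer = AutoTokenizer.from_pretrained(ckpt_path, trust_remote_code=True)
# Set `torch_dtype=torch.float16` to load the model in float16; otherwise transformers loads it as float32, which may exhaust GPU memory.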
model = AutoModelForCausalLM.from_pretrained(ckpt_path, torch_dtype=torch.float16, trust_remote_code=True).cuda()
model = model.eval()
model.vit.resize_pos()
img_path_list = [
    './panda.jpg',
    './bamboo.jpeg',
]
images = []
for img_path in img_path_list:
    image = Image.open(img_path).convert("RGB")
    image = model.vis_processor(image)
    images.append(image)
image = torch.stack(images)
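# Prompt (in English): "Please write an essay based on the pictures: My Favorite Little Animal.
# Requirements: choose a good angle, settle on a theme, make the genre clear, and write your own title;
# do not imitate or plagiarize; do not disclose personal information."
query = '<ImageHere> <ImageHere>请根据图片写一篇作文：我最喜欢的小动物。要求：选准角度，确定立意，明确文体，自拟标题；不要套作，不得抄袭；不得泄露个人信息。'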
response, history = model.chat(tokenizer, query=query, image=image, history=[], meta_instruction='')
print(response)
# 我最喜欢的小动物\n说起我喜欢的动物，那可多了，有活泼可爱的小白兔、机灵的猴子、忠诚的狗……但是我最喜欢的还是可爱的大熊猫。\n大熊猫是哺乳动物中的一种，主要分布在中国四川、陕西和甘肃等地的山区。它有着大大的眼睛，圆圆的耳朵，胖乎乎的身子，最特别的是它的身体黑白相间，所以大家都叫它“黑白仔”。\n因为它的长相很呆萌，很多人都特别喜欢它，于是就有了许多关于它的玩具。在动物园里可以看到许多熊猫玩具和熊猫主题的衣服，还看到很多小朋友抱着熊猫玩偶在玩呢！\n说到熊猫吃竹子了，那可是它们的最爱，几乎每天都吃不腻。别看它长得肥肥胖胖的，其实它也很瘦啊，都是被肚子里的竹子给撑大的哦！熊猫每次吃东西的时候，都会用两只前爪抓住竹子，然后津津有味地吃起来。\n熊猫虽然看上去温顺又憨厚，但是它发起脾气来也是不客气的。如果你去逗它，惹得它生气了，它会举起它的两个爪子，往你身上挥舞着。这时候你可不能还手哦，因为它那一巴掌下去，可不是闹着玩的，会把你打得鼻青脸肿的哦！如果它觉得无聊了，也会把自己扔进竹筐里来回滚动，好像一个球一样在地上翻滚。看着就让人忍不住想去摸摸它，抱抱它。\n你们知道吗？现在我们的国宝大熊猫已经濒临灭绝了，所以现在我们要好好保护大熊猫，让他们健康快乐地成长。
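
The comment in the loading code warns that float32 weights may not fit in GPU memory. A rough back-of-the-envelope estimate shows why halving the precision matters. This is a sketch under a stated assumption: the parameter count of 7e9 is taken from the "7b" in the checkpoint name and is only nominal, and the estimate covers the weights alone, not activations or the generation cache.

```python
BYTES_PER_PARAM = {"float32": 4, "float16": 2}  # storage size of each dtype

NOMINAL_PARAMS = 7e9  # assumption: taken from the "7b" in the checkpoint name


def weight_memory_gib(num_params: float, dtype: str) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


for dtype in ("float32", "float16"):
    print(f"{dtype}: ~{weight_memory_gib(NOMINAL_PARAMS, dtype):.1f} GiB")
```

Under that assumption the weights need roughly 26 GiB in float32 and 13 GiB in float16, so the float16 setting halves the footprint before any inference overhead is counted.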