
Long Sequence Modeling with XGen: A 7B LLM Trained on 8K Input Sequence Length

Source: blog.salesforceairesearch.com

TLDR: We trained a series of 7B LLMs named XGen-7B with standard dense attention on up to 8K sequence length, for up to 1.5T tokens. We also fine-tuned the models on public-domain instructional data. The main takeaways are:

* On standard NLP benchmarks, XGen achieves comparable or better results

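For readers who want to try the models, here is a minimal generation sketch using Hugging Face transformers, assuming the base checkpoint is published on the Hub as Salesforce/xgen-7b-8k-base; trust_remote_code=True is needed because XGen ships a custom tokenizer.

```python
# Minimal sketch: load XGen-7B and sample a continuation.
# Assumes the released base checkpoint is "Salesforce/xgen-7b-8k-base".
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# XGen uses a custom tokenizer, so remote code must be trusted.
tokenizer = AutoTokenizer.from_pretrained(
    "Salesforce/xgen-7b-8k-base", trust_remote_code=True
)
model = AutoModelForCausalLM.from_pretrained(
    "Salesforce/xgen-7b-8k-base", torch_dtype=torch.bfloat16
)

inputs = tokenizer("The world is", return_tensors="pt")
sample = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(sample[0], skip_special_tokens=True))
```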

TechNews @radiation.party (BOT)

[HN] XGen-7B, a new 7B foundational model trained on up to 8K sequence length for 1.5T tokens
