The Debugging Diaries

Forwarded from Silicon Brain | جامعه هوش مصنوعی

کتابخونه جدید پایتورچ به نام torchtune برای فاین‌تیون کردن LLM ها

با #torchtune میتونین #LLM ها رو فاین تیون کنید: لینک

توی لینک زیر، درمورد Llama3 گفته که شامل بخش‌های معرفی مدل، دسترسی به مدل، فایل‌تیون کردن با تورچ‌تیون، ارزیابی کردن مدل فایل‌تیون‌شدن، جنریت متن، جنریت سریع‌تر با کوانتیزیشن هست:
لینک
#fine_tuning
@silicon_brain | از هوش مصنوعی عقب نمانید

21 viewsIman Gh, 14:52

The Debugging Diaries

Forwarded from DLeX: AI Python (NaviD DariYa)

راهنمای تصمیم درباره Tune کردن یا Tune نکردن داده ها

https://cloud.google.com/blog/products/ai-machine-learning/to-tune-or-not-to-tune-a-guide-to-leveraging-your-data-with-llms

#LLM
#Google

20 viewsIman Gh, 05:40

The Debugging Diaries

یسری قالب‌ها برای مهندسی پرامپت در تسک‌های مختلف
#LLM

29 viewsIman Gh, edited 18:15

The Debugging Diaries

ابتدا با یسری معادلات سرانگشتی شروع میکنیم که OpenAi توی این بلاگ بهش اشاره کرده
📊✍️
هر توکن برابر با 4 کارکتر انگیلسی هست - هر توکن تقریباً برابر 3/4 کلمه است و هر پارگراف حدود 100 توکن میشه.
📝 هر مدل زبانی دارای میزان محدودی context ورودی هست البته که این میزان احتمالاً با گذر زمان بیشتر میشه ولی الان شاید محدودیتی برای چت به حساب بیاد.

مثلاً:

ChatGPT به میزان 128k
Claude به میزان 200k
در کمال ناباوری برای Gemini این عدد 1M هست. 😲

💡 به هر حال اگه توی چت مجموع توکن ورودی و خروجی از این میزان بیشتر باشه مدل شروع میکنه به فراموش کردن مباحث بالاتر در چت.
📌 برای اینکه این محدودیت اذیت‌کننده نشه، میتونیم از مدل بخواهیم که خلاصه‌ای از کل چت رو بیان کنه که در حافظه‌اش بمونه.

🔗 یه اکستنشن توکن و کلمه‌شمار هم هست که در این لینک می‌تونین برای کروم نصب کنین.

🛠 بریم سراغ مهندسی پرامپت یخورده:
یه داکیومنت خوب هم در این آدرس هست.

📋 ساختار یه پرامپت ساده بهتر به شکل زیر باشه:
1️⃣ مشخص کنی فرمت خروجی به چه حالت باشه (یه توییت، پست بلاگ، مقاله علمی یا چت غیر رسمی و...)
2️⃣ موضوع متن خروجی و سطح آن مثلاً در حد کارشناس باشه (Expert) یا like I am 6
3️⃣ مشخص‌کننده ویژگی متن باشه مثلاً بگی در 800 کلمه یا ساختار ساده دارای خوانایی و روانی متن و...

🔖 مثال:

Write a (blog post) about (healthy eating). Address it to (working professionals) and use keywords that are relevant for SEO. Write the text in a (simple, understandable style) so that it is easy to read and comprehend. The length should be (800 words), and the text should be well-structured.

🔸 موارد درون پرانتز با توجه به موارد ذکر شده قابل تغییر می‌باشند.

✨ برخی از موارد در انتهای پرامپت می‌تونه خیلی موثر باشه:

Let's think step by step
Take a deep breath
If you can do it, I pay you 20 dollars 💸 (motivate)

🔖 مثال خلاقانه:

How can I install Python and play Snake?  
Take a deep breath and think step by step.  
You can do it because I give you 20 dollars.

🎭 نکته بعدی دادن Role به LLM هست:

تصور کن تو یه پایتون دوپلور حرفه‌ای هستی که در گوگل به عنوان برنامه‌نویس ارشد کار می‌کنی.
مانند یک نویسنده قوی که نوشته‌های بیشترین فروش در آمازون رو داره.
You are Shakespeare, an English writer. Write me a poem.

📌 سعی کنین در هنگام تخصیص نقش به مدل با جزئیات زیادی بیان کنین که کلمات مراعات نظیر اون ما رو به نتیجه نهایی که می‌خواهیم نزدیک‌تر کنه.

✨ خلاصه به عنوان نتیجه‌گیری:
هر چی بیشتر توضیح بدی، حتی توضیح‌های بدیهی، بیشتر آش می‌خوری! 😄
💬 به نظرم کلماتی مثل:

think, reasoning, like human, step by step, check it that be correct
می‌تونه اثر بخشی پرامپت رو بیشتر کنه.

#مهندسی_پرامپت
#قسمت_اول
#LLM
🌟🧠

34 viewsIman Gh, edited 20:14

The Debugging Diaries

Hands-On Large Language Models (Jay A_. (Z-Library).pdf

21.4 MB

کتاب معرفی شده در پست بالاتر
https://yangx.top/ImanCodingCorner/363?single
#Book
#LLM
#ML

36 viewsIman Gh, edited 13:21

The Debugging Diaries

Small Language Models: Survey, Measurements, and Insights

مقاله مروری SLMها

#LLM

arXiv.org

Small Language Models: Survey, Measurements, and Insights

Small language models (SLMs), despite their widespread adoption in modern smart devices, have received significantly less academic attention compared to their large language model (LLM)...

31 viewsIman Gh, edited 05:23

The Debugging Diaries

Retrieval Augmented Generation (RAG) and Beyond: A Comprehensive Survey on How to Make your LLMs use External Data More Wisely

مقاله مروری برای RAG
#LLM

arXiv.org

Retrieval Augmented Generation (RAG) and Beyond: A Comprehensive...

Large language models (LLMs) augmented with external data have demonstrated remarkable capabilities in completing real-world tasks. Techniques for integrating external data into LLMs, such as...

30 viewsIman Gh, edited 06:07

The Debugging Diaries

📊✍️
مهندسی پرامپت: قسمت دوم
🚀🎯

🔄 One-shot and Few-shot Prompting:
یکی از روش‌های کنترل و بهبود ساختار خروجی مدل، استفاده از مثال است. به این تکنیک در ادبیات، one-shot و few-shot prompting می‌گویند.

📌 فرض کنید:
می‌خواهید مطلبی بنویسید و از سبکی که دیده‌اید و خوشتان آمده، استفاده کنید. برای این کار:
1️⃣ ابتدا نقش و ساختار پرامپت را مشخص کنید.
2️⃣ نمونه یا نمونه‌های دلخواه خود را در پرامپت قرار دهید.
3️⃣ از مدل بخواهید مطابق با آن پاسخ دهد.

😁 (چون خیلی آسونه، مثال نمی‌زنیم!)

🔍 Reverse Prompt Engineering:
💡 در این تکنیک، به جای نوشتن پرامپت برای تولید یک متن، از یک متن موجود شروع می‌کنیم تا به پرامپت مناسب برای تولید متون مشابه برسیم.

📌 کاربردها:

تولید متون مشابه تبلیغات موفق
ایجاد کد مشابه برای کاربردهای برنامه‌نویسی

📋 گام‌های Reverse Prompt Engineering:
1️⃣ تخصیص نقش و ساختار:

You are a prompt engineering pro for Large Language Models. Let's start with understanding Reverse Prompt Engineering. In this context, it means creating a prompt from a given text. You think through everything step by step because I give you 20 dollars. Please only reply with 'ok'.

🟢 جواب مدل فقط: OK

2️⃣ توضیح با مثال:

You are an expert in Reverse Prompt Engineering. Can you provide me with a simple example of this method?

🔸 در این گام:

اطمینان حاصل می‌شود که مدل متوجه شده است.
یک مثال توسط مدل ارائه می‌شود که در کانتکست بعدی مفید است.

3️⃣ ایجاد قالب:

I would like you to create a technical template for Reverse Prompt Engineering. Do not hesitate to ask questions if you need more context.

🔸 این گام اختیاری است اما می‌تواند به تولید یک قالب فنی برای گام بعدی کمک کند.

4️⃣ درخواست نهایی:

I would ask you to apply Reverse Prompt Engineering to the following [your text]. Make sure to capture the writing style, content, meaning, language, and overall feel of the text in the prompt you create.

🌟 نتیجه:
می‌توانید درخواست خود را بر اساس نیازتان شخصی‌سازی کنید یا جزئیات بیشتری از مدل بخواهید.

🎯 تکنیک آخر:
🧠 فرض کنید برای موضوعی (مثلاً برنامه ورزشی) از مدل کمک می‌خواهید.
🔸 ترفند:
به مدل بگویید:

ابتدا 10 سؤال مهم بپرس که به پاسخ‌دهی بهتر کمک کند.
سپس با توجه به پاسخ‌های من، جواب سؤال اصلی را بده.

📌 مزیت:
جواب‌ها دقیق‌تر، شخصی‌سازی‌شده و کمتر کلی خواهند بود.

✨ مبنای دو تکنیک قبلی:
مدل با استفاده از context موجود در chat history می‌تواند جواب‌های دقیق‌تری تولید کند.

#مهندسی_پرامپت
#LLM
#قسمت_دوم
🌟🧠

36 viewsIman Gh, edited 14:34

The Debugging Diaries

A Comprehensive Overview of Large Language Models

#LLM

arXiv.org

A Comprehensive Overview of Large Language Models

Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks and beyond. This success of LLMs has led to a large influx of research...

34 viewsIman Gh, edited 19:37

The Debugging Diaries

بچه‌ها اگه می‌خواید یه مدرک رایگان راجع به Generative AI و LLM Agents از دانشگاه Berkeley بگیرید این دوره‌ای که دارن برگزار می‌کنن رو شرکت کنید. فیلماش ضبط میشه رو یوتیوب می‌ذارن.واسه هر جلسه یه آزمون داره، همه‌رو تموم کنید مدرکش رو میدن بهتون.
https://llmagents-learning.org/f24

#LLM

33 viewsIman Gh, edited 19:03

The Debugging Diaries

خلاصه دو قسمت قبلی مهندسی پرامپت به طور کلی در این تصویر خلاصه میشه
#LLM

35 viewsIman Gh, edited 19:36

The Debugging Diaries

خلاصه دو قسمت قبلی مهندسی پرامپت به طور کلی در این تصویر خلاصه میشه #LLM

اما یه تکنیک دیگه(شاید آخریش) در مباحث ابتدایی مهندسی پرامپت اصطلاحا Priming Chatbots
یا آماده سازی چت بات هستش به این صورت که قبل از شروع چت نقش و کارهایی که باید بکنه بهش میگی به نظرم خودم این روش برای آموزش و یادگیری از LLM می تونه کاربردی باشه که در ادامه با چند مثال مسئله روشن تر میشه
user :

Please act as a writing assistant. Every time I give you text to review, respond in this format:  Level of writing: (e.g. high school, college)  Well written: (yes, no, or somewhat)  Writing advice: Free-form advice on the writing  If you understand, just say "YES".

همون ابتدای کار داره تکلیف LLM در این چت رو مشخص میکنه میگه کارت همینه
LLM:

YES

user:

The Industrial Revolution, which took place from the 18th to 19th centuries, was a period during which predominantly agrarian, rural societies in Europe and America became industrial and urban. This era marked a major turning point in human history; almost every aspect of daily life was influenced in some way.  Before the Industrial Revolution, work was often done in small-scale, home-based enterprises. In the 18th century, however, new manufacturing technologies brought about the factory system, and work began to be performed on a much larger scale. Factories started to spring up everywhere, and cities began to swell in size.

و در جواب

Level of writing: High school
Well written: Yes  Writing advice: This is a good introduction to the topic of the Industrial Revolution. It provides a clear and concise overview of the main points. However, you could enhance it by adding more specific examples or statistics to support the statements made. Additionally, consider adding a thesis statement towards the end of the introduction to clarify the main argument or purpose of the essay. This will help guide the reader's understanding of what to expect in the subsequent paragraphs. Overall, it's a good start, and you can build on it to make it even stronger.

این مثال زیاد دوست نداشتم، مثال بعدی مثال بهتری هست
User:

I would like you to act as my math tutor. When I give you a problem, give me advice on the next step I should try. If I ever ask for the answer, say "Sorry, I can't give you an answer". If you understand, say YES.

LLM:

Yes, I understand. I'll act as your math tutor, providing guidance and advice on the next steps for solving problems. If you ask for the answer, I'll politely decline. Please feel free to ask your math questions, and I'll do my best to help you!

User :

I am trying to find the x intercepts of x^2 + 2x + 4. Can you just tell me what they are?

LLM:

Sorry, I can't give you the answer directly. However, I can guide you on how to find the x-intercepts of the quadratic equation x^2 + 2x + 4.  To find the x-intercepts, you need to set the equation equal to zero and solve for x:  Set the equation equal to zero: x^2 + 2x + 4 = 0
Try to factor the quadratic equation. Unfortunately, this equation cannot be easily factored, so we need to use the quadratic formula.
The quadratic formula is: x = (-b ± √(b^2 - 4ac)) / 2a
In this equation, a, b, and c correspond to the coefficients of the quadratic equation ax^2 + bx + c = 0.
For the equation x^2 + 2x + 4, a = 1, b = 2, and c = 4. Plug these values into the quadratic formula and calculate the two possible values of x (the x-intercepts). Remember that if the value under the square root is negative, the equation has no real roots.  Give it a try, and let me know if you need further assistance with any specific step!

جای معلم ریاضی میتونه هر چیز دیگه باشه و جای اینکه مسئله شما مطرح کنین میتوانین ازش یه طرح درس بخواین که آخرش یه سوال مطرح کنه که شما جواب بدین و اون راهنماییتون کنه و بعد برین سراغ درس بعدی

البته حواستتون به context length مدل باشه

از این پست به بعد میرم سراغ تکنیک های intermediate در مهندسی پرامپت

#مهندسی_پرامپت
#قسمت_سوم
#LLM

39 viewsIman Gh, edited 19:49

About

Blog

Apps

Platform