
Connecting Knowledge Base to GPT Chatbots - AI Data Training Guide

Train it on company manuals and product info! Learn how to connect your data to chatbots to create expert consultants.

"What I Made is Perfect!"... Really?

You worked hard on your chatbot. You tested it, it works well, and it seems perfect.

But when you showed it to a friend: "Um... how do I use this?" "Why does it respond like that?" "Is this needed?"

Shocking, right? But this is reality.

What you create looks perfect only to you. You need other people to use it to find the real problems.

Today I'll show you how to properly test chatbots.

Why Do You Need Others to Test?

1. Break Your Bias

The creator already knows how to use it. But what about first-time users?

Me: (Obviously they'll mention the menu type)
Friend: "What do you have?" (Unexpected question)
→ Bot confused: "Sorry, I didn't understand"

2. Discover Actual Usage Patterns

You can discover unexpected usage patterns.

Expected: "I'll order pizza" → "Pepperoni please"
Actual: "I'm hungry" → "Hurry" → "Just anything"

3. Find Hidden Bugs

Problems you couldn't find alone appear.

Solo test: All steps in order
Friend test: "Oh cancel this" → System error?

Preparing for Testing

Step 1: Recruit Testers

Who to ask?

Good Testers:

  • People who'll be honest
  • People similar to actual target users
  • People not tech-savvy (even better!)

Number:

  • Minimum 3 people
  • People from diverse backgrounds

Example:

Tester 1: Female in 20s, shops online frequently
Tester 2: Male in 40s, not familiar with technology
Tester 3: Teen, first time using chatbot

Step 2: Prepare Test Environment

Decide Sharing Method:

ChatGPT Custom GPT:

  • Copy and send link
  • Set sharing so that anyone with the link can open it (not private)

Claude Project:

  • Copy the prompt and send it
  • Have testers create their own project with it

Poe Bot:

  • Share link
  • Let testers know that login is required

Step 3: Write Test Scenarios

Don't just say "try it". Give testers specific missions.

Example: Shopping Mall Chatbot

Mission 1: Find Product
"Try finding a dress.
What was the process?
Was it easy or difficult?"

Mission 2: Get Recommendation
"Get a date outfit recommendation.
Did you like the recommendation?"

Mission 3: Order
"Try ordering a product you like.
(Don't actually pay)
Was the process clear?"

Mission 4: Problem Situation
"Try canceling your order midway.
Could you do it easily?"

3 Testing Methods

Method 1: Think Aloud

The tester says their thoughts out loud while using the chatbot.

How:

Tester: "Hmm... I want to find a dress...
        What should I say?
        Should I try 'show me dresses'?
        (types)
        Oh, it appeared!
        But there's too many?
        How do I narrow it down..."

Advantages:

  • See thought process in real-time
  • Know exactly where they get stuck

Record:

Stuck Points:
- "Don't know what to say" (at start)
- "Too many options" (search results)
- "Can't find cancel button" (mid-cancellation)

Method 2: Observe

Don't ask them to talk, just watch.

Checklist:

□ Do they pause at first screen?
□ What do they click/type first?
□ Do they repeat same question?
□ Do they make frustrated faces?
□ Do they give up midway?

Measure Timing:

Start → First input: 15 seconds (too long?)
Question → Answer: 2 seconds (appropriate)
Complete mission: 3 minutes (goal: 2 minutes)
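
If you jot down timestamps during a session, a few lines of Python can work out these intervals for you. This is only a sketch: it assumes a log written as "MM:SS - event" lines (the same shape as the cafe example later in this post) and assumes that typed inputs are noted with the word "Types". The function names are made up for illustration.

```python
def to_seconds(stamp: str) -> int:
    """Convert an "MM:SS" stamp to a number of seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + int(seconds)


def parse_log(log: str) -> list[tuple[int, str]]:
    """Turn "MM:SS - event" lines into (seconds, event) pairs."""
    events = []
    for line in log.strip().splitlines():
        stamp, _, text = line.partition(" - ")
        events.append((to_seconds(stamp.strip()), text.strip()))
    return events


def timing_summary(events: list[tuple[int, str]]) -> dict[str, int]:
    """Report time to the first typed input and total session time.

    Assumes at least one event starts with "Types"; otherwise next() raises.
    """
    start = events[0][0]
    first_input = next(t for t, text in events if text.startswith("Types"))
    return {
        "time_to_first_input": first_input - start,
        "total_time": events[-1][0] - start,
    }


# Hypothetical session notes, written by the observer.
session_log = """
00:00 - Start
00:15 - Types "show me dresses"
00:17 - Bot responds with a list
03:00 - Complete
"""

summary = timing_summary(parse_log(session_log))
print(summary)
```

Compare the numbers against your own goals. A long gap before the first input usually means the start screen isn't clear enough.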

Method 3: Survey

Ask questions after testing.

Essential Questions:

1. Was the chatbot easy to use? (1-5 points)
   1 (Very difficult) ~ 5 (Very easy)

2. Were the chatbot's responses easy to understand? (1-5 points)

3. Did you get the results you wanted? (Yes/No)
   - If No, why?

4. What did you like best?

5. What was most inconvenient?

6. Would you use this chatbot again? (Yes/No)

7. Would you recommend to friends? (1-5 points)

Additional Questions:

8. How was the chatbot's tone?
   - Too formal
   - Appropriate
   - Too casual

9. How was response speed?
   - Too slow
   - Appropriate
   - Fast

10. Most memorable conversation?
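
Once the answers come in, you can summarize them with a short script. The sketch below assumes you typed each tester's answers into a dictionary. The field names (`ease`, `got_result`, `would_reuse`) and the sample values are made up for illustration and are not real results.

```python
from statistics import mean

# Hypothetical answers: one dict per tester.
# "ease" is question 1 (1-5), "got_result" is question 3, "would_reuse" is question 6.
responses = [
    {"ease": 4, "got_result": True, "would_reuse": True},
    {"ease": 1, "got_result": False, "would_reuse": False},
    {"ease": 1, "got_result": False, "would_reuse": False},
]


def summarize(answers):
    """Average the 1-5 score and turn the Yes/No answers into rates."""
    total = len(answers)
    return {
        "avg_ease": round(mean(a["ease"] for a in answers), 2),
        "success_rate": sum(a["got_result"] for a in answers) / total,
        "reuse_rate": sum(a["would_reuse"] for a in answers) / total,
    }


summary = summarize(responses)
print(summary)
```

With only three testers the averages are rough, so read them together with the written answers to questions 4 and 5.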

Real Test Scenario

Test 1: First Impression Test

Goal: Can they understand what the chatbot does within 5 seconds?

Method:

  1. Show chatbot screen
  2. Cover screen after 5 seconds
  3. Ask questions

Questions:

- What do you think this chatbot does?
- How would you start it?
- What tone do you expect?

Good Example:

"Oh, it helps with shopping!"
"I'd tell it what I'm looking for in the input box"
"Feels friendly"

Bad Example:

"Um... not sure what it does?"
"Where do I start?"
"Is this a chatbot?"

Test 2: Core Function Test

Goal: Can they use main features within 3 minutes?

Scenario:

[Shopping Mall Chatbot Example]

Start:
"Imagine this is your first time seeing this chatbot.
Try finding a dress and adding it to cart."

Observation Points:
✓ Did they start on their own?
✓ Did they proceed without getting stuck?
✓ Did they look for help?
✓ Did they give up?

Goal: Complete within 3 minutes

Record Results:

Tester 1: 2 min 30 sec ✓ Success
- Went smoothly
- Paused 3 seconds at "size" selection

Tester 2: 5 min ✗ Failed
- "Don't know how to search"
- Couldn't find help
- Felt uncertain midway: "Is this right?"

Tester 3: 3 min 50 sec △ Almost successful
- Overall OK
- Couldn't find cart button
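
To compare the recorded times against the 3-minute goal, something like this could help. The times are the three recorded above. The function name `seconds_over_goal` is made up for illustration.

```python
GOAL_SECONDS = 3 * 60  # the 3-minute goal from this test

# Completion times recorded above, converted to seconds.
recorded = {
    "Tester 1": 2 * 60 + 30,
    "Tester 2": 5 * 60,
    "Tester 3": 3 * 60 + 50,
}


def seconds_over_goal(times, goal=GOAL_SECONDS):
    """Return how many seconds each tester exceeded the goal (0 if within it)."""
    return {name: max(0, secs - goal) for name, secs in times.items()}


overrun = seconds_over_goal(recorded)
within_goal = [name for name, extra in overrun.items() if extra == 0]
print(overrun)
print(f"{len(within_goal)} of {len(recorded)} testers finished within the goal")
```

The overrun tells you how far off each tester was, which is more useful than a plain pass/fail when deciding what to fix first.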

Test 3: Exception Handling Test

Goal: Does it handle problem situations well?

Scenario:

Mission: Try using it weirdly on purpose

1. Input profanity or strange words
   → Bot reaction?

2. Wait without any answer
   → Timeout? Re-ask?

3. Unrelated question midway
   Example: During order "What's your name?"
   → Maintain context?

4. Commands like "cancel", "start over", "quit"
   → Work properly?

Checkpoints:

✓ Profanity filtering working?
✓ Kind even with strange input?
✓ Not losing context?
✓ Way for users to exit?

Organizing Feedback

Categorize Problems

Organize received feedback:

[Critical] 🔴
- Chatbot frozen
- Can't order
- Personal info exposed

→ Need immediate fix

[Important] 🟡
- Answers incomprehensible
- Skipping steps
- Insufficient help

→ Priority fix

[Improvements] 🟢
- Tone a bit stiff
- Not enough emojis
- Could be faster

→ Fix when you have time

Frequency Check

Did multiple people say the same thing?

"Don't know how to start" - 3 people 🔴
"Answers too long" - 2 people 🟡
"Add more emojis" - 1 person 🟢

→ If 3 people said it, it's a real problem!
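
If you collect comments in a list, counting repeats takes only a few lines. This sketch assumes you have already reworded similar comments so that they match exactly. The thresholds (3 or more = critical, 2 = important, 1 = improvement) simply mirror the example above with three testers, so adjust them for a larger group.

```python
from collections import Counter

# One entry per tester who raised the issue (wording normalized by hand).
feedback = [
    "Don't know how to start",
    "Answers too long",
    "Don't know how to start",
    "Add more emojis",
    "Answers too long",
    "Don't know how to start",
]


def rank_feedback(comments):
    """Count identical comments and label them by how many testers raised them."""
    labels = {1: "improvement", 2: "important"}  # anything higher is critical
    ranked = []
    for comment, count in Counter(comments).most_common():
        ranked.append((comment, count, labels.get(count, "critical")))
    return ranked


ranked = rank_feedback(feedback)
for comment, count, label in ranked:
    print(f"{count} x {comment!r}: {label}")
```

Frequency is only one signal. A single report of exposed personal info is still critical even if nobody else mentions it.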

Real Test Case: Cafe Order Bot

Test Setup

Bot: Cafe order chatbot
Testers: 3 friends (A, B, C)
Mission: "Try ordering an Americano"

Tester A (Female 20s, IT familiar)

Progress:

00:00 - Start
00:05 - Types "Americano please"
00:07 - Bot responds: "What size?"
00:10 - Types "Tall"
00:12 - Bot responds: "3,500 won. Order?"
00:15 - Types "Yes"
00:17 - Complete

Total time: 17 seconds

Feedback:

  • "Fast and easy"
  • "Tone a bit stiff. At least add emojis..."
  • Score: 4/5

Tester B (Male 40s, not tech familiar)

Progress:

00:00 - Start
00:15 - (Does nothing)
00:20 - "Um... what should I say?"
00:30 - Types "I want to order"
00:33 - Bot: "What would you like?"
00:40 - Types "Americano"
00:42 - Bot: "What size?"
00:50 - Types "Big one"
00:52 - Bot: "Sorry, I didn't understand"
01:00 - "???"
01:10 - Types "Large!"
01:12 - Bot: "Sorry..."
01:20 - (Gives up)

Failed

Feedback:

  • "Don't know how to start"
  • "I said big one, why doesn't it understand?"
  • "Frustrating"
  • Score: 1/5

Tester C (Teen, has chatbot experience)

Progress:

00:00 - Start
00:02 - Types "hi"
00:04 - Bot: "Sorry..."
00:06 - "Hello"
00:08 - Bot: "Sorry..."
00:10 - "Ugh seriously"
00:15 - "Americano"
00:17 - Bot: "What size?"
00:20 - "Venti"
00:22 - Bot: "Sorry..."
00:25 - "So annoying"
00:30 - (Gives up)

Failed

Feedback:

  • "Can't even respond to greetings?"
  • "Doesn't know venti? It's Starbucks terminology"
  • "Unfriendly"
  • Score: 1/5

Problems Discovered!

🔴 Critical:
- Starting method unclear (Tester B)
- Poor natural language understanding ("big one", "venti")

🟡 Important:
- No greeting handling (Tester C)
- No help/guidance (Testers B, C)

🟢 Improvements:
- Improve tone (Tester A)
- Add emojis (Tester A)

Fix and Retest

Changes

[Change 1] Clarify Start Message
Before:
(No message)

After:
"Hello! I'll help you order ☕
What drink would you like?
Example: Americano, Latte, Cappuccino"

[Change 2] Improve Natural Language
Add keywords:
- "Big one" = Large
- "Venti" = Large
- "Small one" = Small
- "Regular" = Medium

[Change 3] Handle Greetings
When the user types "Hello", "hi", etc.:
"Hello! 😊
Can I help you order drinks?"

[Change 4] Add Emojis
Use ☕🥤😊✨ where appropriate
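
With a prompt-based bot such as a Custom GPT, Changes 2 and 3 would normally be written into the instructions. If your bot is rule-based code instead, they might look roughly like this. It's a minimal sketch: the names `SIZE_SYNONYMS`, `GREETINGS`, `normalize_size`, and `is_greeting` are hypothetical, and the synonym table holds only the keywords listed above plus the size names themselves.

```python
# Keywords from Change 2, mapped to the bot's three size names.
SIZE_SYNONYMS = {
    "big one": "large",
    "venti": "large",
    "large": "large",
    "small one": "small",
    "small": "small",
    "regular": "medium",
    "medium": "medium",
}

# Greetings from Change 3.
GREETINGS = {"hello", "hi"}


def _clean(text):
    """Lowercase the input and drop surrounding spaces and trailing punctuation."""
    return text.strip().lower().rstrip("!.?")


def normalize_size(text):
    """Return "small", "medium", or "large", or None if the size is unknown."""
    return SIZE_SYNONYMS.get(_clean(text))


def is_greeting(text):
    """True when the whole message is just a greeting."""
    return _clean(text) in GREETINGS


print(normalize_size("Big one"))
print(normalize_size("Large!"))
print(is_greeting("hi"))
```

When `normalize_size` returns None, list the valid sizes for the user rather than only saying "Sorry, I didn't understand".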

Retest Results

Tester B (Second Try):

00:00 - Start
00:05 - Bot: "Hello! I'll help you order ☕..."
00:08 - "Oh, I get it now"
00:12 - "Americano"
00:14 - Bot: "What size? (small/medium/large)"
00:18 - "Big one"
00:20 - Bot: "Large Americano! 4,000 won ✨"
00:25 - "Yes"
00:27 - Complete!

Success! Score: 4/5

Tester C (Second Try):

00:00 - "Hello"
00:02 - Bot: "Hello! 😊 Can I help you order?"
00:05 - "Americano"
00:07 - Bot: "What size?"
00:10 - "Venti"
00:12 - Bot: "Large size! Great choice ☕"
00:17 - Complete!

Success! Score: 5/5

Much better, right?

Remote Testing Method

What if you can't meet in person?

Use Tools

Request Screen Recording:

  • Record with Loom, OBS etc.
  • Think aloud while proceeding

Video Call:

  • Zoom, Google Meet
  • Test while screen sharing

Survey Tools:

  • Google Forms
  • Typeform
  • Notion

Send Remote Test Guide

Hello! Please help test 😊

1. Click this link:
   [Chatbot link]

2. Complete these missions:
   - Mission 1: [...]
   - Mission 2: [...]

3. If possible, record your screen
   (OK if not)

4. Fill out this survey after completing:
   [Google Form link]

Time needed: About 10 minutes
Thank you!

Creating Beta Tester Group

A beta tester group gives you continuous testing from multiple people.

Recruitment

[Beta Tester Recruitment]

Looking for beta testers for 'Restaurant Recommendation Chatbot'!

Benefits:
- Free premium on official launch
- Participate in development process
- Name in credits

Requirements:
- Test once a week (about 10 minutes)
- Honest feedback

Interested? Comment or DM!

Operations

Discord/Slack Channel:

#announcements: Update notices
#testing: Test missions
#feedback: Share opinions
#bugs: Report problems

Regular Surveys:

  • Satisfaction survey every 2 weeks
  • Vote on major improvements

Prioritizing Feedback

You can't act on every piece of feedback.

Priority Matrix

          High Impact
               ↑
  [Urgent]     |  [Important]
  Fix Now      |  Plan to Fix
---------------|---------------
  [Hold]       |  [Consider]
  Later        |  If Time
               ↓
          Low Impact

Example:

[Urgent] Order doesn't complete (bug)
→ Fix immediately

[Important] Low search accuracy
→ Fix by next week

[Consider] Add more emojis
→ After other work

[Hold] Voice recognition feature
→ Next version

Next Steps

In the next post, we'll cover embedding chatbots in blogs.

I'll show you how to actually embed chatbots on websites!

Wrapping Up

Testing is tedious but essential. There are problems you can never find alone.

Key points:

  1. Test with real users
  2. Observe and listen
  3. Categorize problems and prioritize
  4. Fix and retest

No chatbot is perfect. But there are chatbots that keep improving.

Ask a friend today. "Can you try this for 5 minutes?"

Those 5 minutes will make your chatbot 10 times better!


Related Posts:

  • [Previous] Chatbot Personality & Speech Style Settings
  • [Next] How to Embed Chatbots in Blogs

Any fun experiences while testing? Share in comments!