Evidences do not exist in chat histories #39

Open
Enting-Chen opened this issue Feb 27, 2025 · 3 comments

Comments

@Enting-Chen

Enting-Chen commented Feb 27, 2025

Hi,

I noticed that some of the evidence entries in the query set do not have corresponding chat conversations:

    "5": {
        "question": "Did Li Hua send a message to Jennifer asking for her opinion on protein supplements before he consulted her about his daily protein powder consumption?",
        "answer": "Yes",
        "evidence": "20260213_16:00<and>20261120_20:00",
        "type": "Multi"
    },

Evidence 20260213_16:00 does not exist in the chat histories. Did you mean 20260214_16:00?

    "56": {
        "question": "Did Li Hua share a blog post about his recent fitness achievements after Jennifer sent him a motivational message?",
        "answer": "Yes",
        "evidence": "20260129_14:00<and>20260520_18:00<and>20260606_09:00<and>20260817_12:15<and>20261022_22:00<and>20261202_14:00",
        "type": "Multi"
    }

I cannot find a conversation with the timestamp 20260129_14:00.
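
A check along the following lines could surface such cases automatically. This is only a minimal sketch: the function name `missing_evidence` and the toy data are hypothetical, and it assumes the query set is a dict keyed by question id and that the conversation timestamps have already been collected into a set. The `<and>` separator is taken from the entries quoted above.

```python
SEPARATOR = "<and>"  # delimiter seen in the evidence strings quoted above


def missing_evidence(queries: dict, known_timestamps: set) -> dict:
    """Map each question id to the evidence timestamps absent from the chat histories."""
    missing = {}
    for qid, entry in queries.items():
        stamps = [s.strip() for s in entry["evidence"].split(SEPARATOR)]
        absent = [s for s in stamps if s not in known_timestamps]
        if absent:
            missing[qid] = absent
    return missing


# Toy data for illustration only; the real query set and the real set of
# conversation timestamps would be loaded from the dataset files.
queries = {
    "5": {"evidence": "20260213_16:00<and>20261120_20:00"},
}
known = {"20260214_16:00", "20261120_20:00"}  # assumed, not the actual chat histories

print(missing_evidence(queries, known))  # {'5': ['20260213_16:00']}
```

Returning only the absent timestamps per question keeps the report short and makes it easy to see which part of a multi-evidence entry is wrong.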

I also noticed trailing whitespace, missing colons between the hour and minute, and stray "why"/"what"/"time:" text in the evidence field. Some questions/answers refer to conversations in 2027. Could you help clean these up, please?
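
The formatting problems in the evidence field could be flagged with a small validator like the one below. It is a sketch under an assumption: the canonical form `YYYYMMDD_HH:MM`, joined by `<and>`, is inferred from the examples above rather than from any documented schema, and `evidence_problems` is a hypothetical helper name.

```python
import re

# Assumed canonical timestamp form, inferred from the examples: YYYYMMDD_HH:MM
STAMP = re.compile(r"\d{8}_\d{2}:\d{2}")


def evidence_problems(evidence: str) -> list:
    """Return human-readable problems found in one evidence string."""
    problems = []
    if evidence != evidence.strip():
        problems.append("leading or trailing whitespace")
    for part in evidence.strip().split("<and>"):
        # Anything that is not exactly one timestamp is reported, which covers
        # missing colons as well as stray labels such as "why" or "time:".
        if STAMP.fullmatch(part) is None:
            problems.append(f"malformed timestamp: {part!r}")
    return problems


print(evidence_problems("20260213_16:00<and>20261120_20:00"))  # []
print(evidence_problems("20260213_1600 "))
print(evidence_problems("time: 20260213_16:00"))
```

Running this over every entry and printing only the non-empty results would give a list of question ids that still need cleaning.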

Thanks!

@Enting-Chen changed the title from "Evidence 20260213_16:00" to "Evidences do not exist in chat histories" on Feb 27, 2025
@TianyuFan0504
Collaborator

Hi @Enting-Chen, thanks for your valuable suggestions. We are currently conducting a comprehensive review of our dataset and will republish it once the review is complete.

@jameswangthecoder
Collaborator

Dear @Enting-Chen,

I have cleaned up the mistakes you mentioned, along with a few other minor ones I found. Good catch, by the way!

Please do not hesitate to let us know if you encounter any other issues, including typos or small mistakes like these. We'll fix them as soon as possible to ensure a smooth experience with MiniRAG.

Thanks!

Best,
James

@Enting-Chen
Author

Enting-Chen commented Feb 28, 2025

Hi, thanks for providing the updated dataset.

I noticed that some questions still have "why/what/time: " in their evidence fields, and some are still missing the colon between the hour and minute.

Could you review these please?

Thanks!
