Dingo MCP Server

by MigoXLab
rules.md (8.94 kB)
The specific rules for each quality metric are as follows:

| Function Name | Type | Description | Reference |
|---------------|------|-------------|-----------|
| RuleAlphaWords | EFFECTIVENESS | check whether the ratio of words that contain at least one alphabetic character > 0.6 | [Redpajama](https://www.together.ai/blog/redpajama-data-v2) [MAP-en](https://arxiv.org/abs/2405.19327) [Gopher](https://arxiv.org/abs/2112.11446) [Dolma](https://arxiv.org/abs/2402.00159) |
| RuleCapitalWords | UNDERSTANDABILITY | check whether the ratio of capital words > 0.2 | [Redpajama](https://www.together.ai/blog/redpajama-data-v2) [MAP-en](https://arxiv.org/abs/2405.19327) |
| RuleCharNumber | EFFECTIVENESS | check whether the number of characters > 100 | [MAP-en](https://arxiv.org/abs/2405.19327) |
| RuleColonEnd | COMPLETENESS | check whether the last character is `:` | |
| RuleContentNull | EFFECTIVENESS | check whether the content is null | |
| RuleCurlyBracket | UNDERSTANDABILITY | check whether the ratio of the number of `{` and `}` to the number of characters < 0.025 | [Redpajama](https://www.together.ai/blog/redpajama-data-v2) [C4](https://arxiv.org/abs/1910.10683) |
| RuleDocRepeat | SIMILARITY | check whether the content repeats | [Redpajama](https://www.together.ai/blog/redpajama-data-v2) [MAP-en](https://arxiv.org/abs/2405.19327) [FineWeb](https://huggingface.co/datasets/HuggingFaceFW/fineweb) [Gopher](https://arxiv.org/abs/2112.11446) |
| RuleHtmlEntity | RELEVANCE | check whether the content has HTML entities | |
| RuleIDCard | SECURITY | check whether the content contains ID card information | |
| RuleLineEndWithEllipsis | COMPLETENESS | check whether the ratio of lines ending with an ellipsis < 0.3 | [Redpajama](https://www.together.ai/blog/redpajama-data-v2) [MAP-en](https://arxiv.org/abs/2405.19327) [Gopher](https://arxiv.org/abs/2112.11446) [Dolma](https://arxiv.org/abs/2402.00159) |
| RuleLineEndWithTerminal | COMPLETENESS | check whether the ratio of lines ending with a terminal punctuation mark > 0.6 | [Redpajama](https://www.together.ai/blog/redpajama-data-v2) [FineWeb](https://huggingface.co/datasets/HuggingFaceFW/fineweb) [C4](https://arxiv.org/abs/1910.10683) |
| RuleLineStartWithBulletpoint | UNDERSTANDABILITY | check whether the ratio of lines starting with a bullet point < 0.9 | [Redpajama](https://www.together.ai/blog/redpajama-data-v2) [MAP-en](https://arxiv.org/abs/2405.19327) [Gopher](https://arxiv.org/abs/2112.11446) [Dolma](https://arxiv.org/abs/2402.00159) |
| RuleLineJavascriptCount | EFFECTIVENESS | check for lines containing the word Javascript | [Redpajama](https://www.together.ai/blog/redpajama-data-v2) [FineWeb](https://huggingface.co/datasets/HuggingFaceFW/fineweb) [C4](https://arxiv.org/abs/1910.10683) |
| RuleLoremIpsum | EFFECTIVENESS | check whether the ratio of lorem ipsum < 3e-08 | [Redpajama](https://www.together.ai/blog/redpajama-data-v2) [MAP-en](https://arxiv.org/abs/2405.19327) [FineWeb](https://huggingface.co/datasets/HuggingFaceFW/fineweb) [C4](https://arxiv.org/abs/1910.10683) |
| RuleMeanWordLength | EFFECTIVENESS | check whether the mean word length is in [3, 10] | [Redpajama](https://www.together.ai/blog/redpajama-data-v2) [MAP-en](https://arxiv.org/abs/2405.19327) [Gopher](https://arxiv.org/abs/2112.11446) [Dolma](https://arxiv.org/abs/2402.00159) |
| RuleNoPunc | FLUENCY | check whether a paragraph has no punctuation | |
| RuleSentenceNumber | COMPLETENESS | check whether the number of sentences is in [3, 7500] | [Redpajama](https://www.together.ai/blog/redpajama-data-v2) [MAP-en](https://arxiv.org/abs/2405.19327) [FineWeb](https://huggingface.co/datasets/HuggingFaceFW/fineweb) [C4](https://arxiv.org/abs/1910.10683) |
| RuleSpecialCharacter | RELEVANCE | check whether the content has special characters | |
| RuleStopWord | EFFECTIVENESS | check whether the ratio of stop words > 0.06 | [Redpajama](https://www.together.ai/blog/redpajama-data-v2) [MAP-en](https://arxiv.org/abs/2405.19327) [Gopher](https://arxiv.org/abs/2112.11446) [Dolma](https://arxiv.org/abs/2402.00159) |
| RuleSymbolWordRatio | EFFECTIVENESS | check whether the ratio of symbols to words > 0.4 | [Redpajama](https://www.together.ai/blog/redpajama-data-v2) [Gopher](https://arxiv.org/abs/2112.11446) [Dolma](https://arxiv.org/abs/2402.00159) |
| RuleUniqueWords | UNDERSTANDABILITY | check whether the ratio of unique words > 0.1 | [Redpajama](https://www.together.ai/blog/redpajama-data-v2) [MAP-en](https://arxiv.org/abs/2405.19327) |
| RuleWatermark | RELEVANCE | check whether the content has watermarks | |
| RuleWordNumber | EFFECTIVENESS | check whether the number of words is in [20, 100000] | [Redpajama](https://www.together.ai/blog/redpajama-data-v2) [MAP-en](https://arxiv.org/abs/2405.19327) [Gopher](https://arxiv.org/abs/2112.11446) [Dolma](https://arxiv.org/abs/2402.00159) |
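
To make the threshold-style rules more concrete, here is a minimal Python sketch of four of them. It is not Dingo's implementation: the function names are hypothetical, words are split on whitespace only, and each comparison is read as a plain predicate over the text. Whether a true result marks the text as good or bad for a given rule is not stated in the table, so the sketch only reports the outcome of each comparison.

```python
def alpha_word_ratio(text: str) -> float:
    """Fraction of whitespace-separated words with at least one alphabetic character."""
    words = text.split()
    if not words:
        return 0.0
    return sum(any(ch.isalpha() for ch in w) for w in words) / len(words)


def mean_word_length(text: str) -> float:
    """Average length, in characters, of whitespace-separated words."""
    words = text.split()
    if not words:
        return 0.0
    return sum(len(w) for w in words) / len(words)


def curly_bracket_ratio(text: str) -> float:
    """Number of '{' and '}' characters divided by the total number of characters."""
    if not text:
        return 0.0
    return (text.count("{") + text.count("}")) / len(text)


def ends_with_colon(text: str) -> bool:
    """True when the last non-whitespace character is ':' (an assumption: trailing whitespace is ignored)."""
    return text.rstrip().endswith(":")


def evaluate(text: str) -> dict:
    """Evaluate the four comparisons using the thresholds listed in the table."""
    return {
        "alpha_words_gt_0.6": alpha_word_ratio(text) > 0.6,           # cf. RuleAlphaWords
        "mean_word_length_in_3_10": 3 <= mean_word_length(text) <= 10,  # cf. RuleMeanWordLength
        "curly_bracket_lt_0.025": curly_bracket_ratio(text) < 0.025,  # cf. RuleCurlyBracket
        "last_char_is_colon": ends_with_colon(text),                  # cf. RuleColonEnd
    }


if __name__ == "__main__":
    print(evaluate("The quick brown fox jumps over the lazy dog."))
    print(evaluate("123 456 {} :"))
```

The thresholds (0.6, [3, 10], 0.025) come straight from the table; everything else, including the tokenization, is a simplifying assumption and will differ from the tokenizers used in the referenced pipelines.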
