Skip to content

Conversation

@thiagogomesverissimo
Copy link
Contributor

@yooper,

First, thank you for your efforts in this library, it is so great.
@Euak and I are working in a project and we did some changes that we believe improve the scope of the concordance method from Corpus class.
Changes:

  • Switched from stripos to preg_match_all to find $needle positions
  • The change above allow us to find $needle in any part of a word
  • It possible now choose if $needle is case insensitive or sensitive

Some things that can be better worked in the future:

  • substr and mb_substr did not handle well with latin accent (like ~, ^ etc) so we did utf8_decode before everything and utf8_encode in return
  • We inserted spaces before and after $text to capture first and last word, maybe we need to change the regex

Minor changes:

  • A public domain Brazilian book added to tests
  • A new test added to concordance method using this book

thiagogomesverissimo and others added 4 commits December 18, 2018 17:00
A book from a famous Brazilian writer Assis, Machado.
It is public domain and available at: http://machado.mec.gov.br
Allow to handle latin chars
@yooper
Copy link
Owner

yooper commented Dec 18, 2018

Thanks for your contributions. I will pull it down and test it out and merge within the next couple days.

@yooper yooper merged commit ba5c76f into yooper:master Dec 19, 2018
@Euak
Copy link
Contributor

Euak commented Dec 19, 2018

Thanks, @yooper . It's very kind of you merging it so fast.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants