December 28, 2009
Answered

PHP help

  • December 28, 2009
  • 1 reply
  • 1203 views

I have been having trouble with this for months.  I have tried everything that I can think of and still cannot get it to work.

Attached is a description of my problem, what I want it to do and what it is actually doing.  It seems that I can only get two extremes - either too much or too little - not what I want in the middle.

Think the problem may be in the regexp or the loop.

Please help!

David_Powers
December 29, 2009

brywilson88 wrote:

Think the problem may be in the regexp or the loop.

The regex is fine. The problem lies in the loop. This does what you want:

// find all matching phrases
preg_match_all('/<b>(.*?)<\/b>/', $content, $search_results);

// get the number of results
$num_results = count($search_results[1]);

// initialize an array to store individual words
$words = array();

// split each matched phrase into words
// and add them to the $words array
for ($i = 0; $i < $num_results; $i++) {
  $words = array_merge($words, explode(' ', $search_results[1][$i]));
}

// loop through the $words array to display each word on a separate line
foreach ($words as $word) {
  echo $word . '<br />';
}

The first loop uses explode() to split individual results into separate words, and then adds them to the $words array using array_merge(). This leaves you with an array called $words that contains all the individual words (including repeated ones).
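
To try this on a small input, here is a self-contained sketch. The sample $content string is invented for illustration, since the original data was not posted, and the loop is written with foreach rather than a counter, which should behave the same way.

```php
<?php
// Hypothetical sample input; the real $content was not shown in the thread.
$content = 'Try <b>red apples</b> and <b>green pears</b> today.';

// find all phrases wrapped in <b> tags
preg_match_all('/<b>(.*?)<\/b>/', $content, $search_results);

// flatten the matched phrases into a single list of words
$words = array();
foreach ($search_results[1] as $phrase) {
  $words = array_merge($words, explode(' ', $phrase));
}

print_r($words);
```

With this input, $words ends up holding red, apples, green and pears as four separate elements.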

December 29, 2009

It is amazing what you know – that worked very well.  Wish I could walk in your shoes for just one day!

Another question – how would I now go about cleaning up the words – i.e. removing commas, apostrophes (and whatever follows the apostrophes), spaces (before and after words) and all the other junk before, after and between words?

I have tried some things that I thought would work, like trim(), strip_tags() and preg_replace(), but they don't do anything. I tried these right before I printed the word - maybe I should do this before I get into the final loop? Or is there something I need to do to the regexp, or should I create another preg_match_all() to strip out all the junk?

Thanks again for your help - you were able to do in minutes what I have been struggling with for months.

David_Powers
Correct answer
December 30, 2009

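You don't need to do anything about single spaces, because explode() uses them as the delimiter and discards them. (A run of two or more spaces does leave empty strings in the array, which you can skip.) To get rid of non-word characters at the end of each word, match only the leading run of word characters, using the regex metacharacter for a word character (\w). This matches A-Z, a-z, 0-9, and the underscore. You should also add the hyphen. So, this would do it: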

foreach ($words as $word) {
  // print the leading run of word characters and hyphens, if there is one
  if (preg_match('/^([-\w]+)/', $word, $m)) {
    echo $m[1] . '<br />';
  }
}

If you want to exclude numbers and the possibility of underscores, change the regex to this:

foreach ($words as $word) {
  // letters and hyphens only: no digits, no underscores
  if (preg_match('/^([-A-Za-z]+)/', $word, $m)) {
    echo $m[1] . '<br />';
  }
}
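
Running the two patterns side by side on a few made-up words may make the difference easier to see. This is only a sketch: the sample words and the helper name leading_match() are assumptions for illustration, not part of the original code.

```php
<?php
// Hypothetical helper: return the leading run matched by $pattern,
// or an empty string when the word does not start with a match.
function leading_match($pattern, $word) {
  return preg_match($pattern, $word, $m) ? $m[1] : '';
}

// Made-up sample words with trailing punctuation and apostrophes.
$words = array('apples,', "John's", 'well-known.', 'route_66!');

foreach ($words as $word) {
  // letters, digits, underscore and hyphen
  $loose = leading_match('/^([-\w]+)/', $word);
  // letters and hyphen only
  $strict = leading_match('/^([-A-Za-z]+)/', $word);
  echo $word . ' => ' . $loose . ' / ' . $strict . "\n";
}
```

Both patterns stop at the first comma, apostrophe or full stop, so "John's" becomes "John". They differ only on a word such as "route_66!", where the first pattern keeps "route_66" and the second keeps "route".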

Don't despair over the time it has taken you. My knowledge of PHP has been built up over eight or nine years. I don't have the benefit of an education in computer science, either. So I didn't have a running start.

What I have discovered is that the best way to approach a problem is by breaking it down into small steps. Instead of thinking in terms of code, I think in terms of what I want to happen. It helps if you start out with a skeleton of comments:

// do this
// then do this
if (condition A) {
  // do one thing
} elseif (condition B) {
  // do something else
} else {
  // do something completely different
}

I then work on each section separately until I get it right. Sometimes, I end up with a solution that works, but is rather complex. So, I then see if there are ways to simplify it. When creating this solution for you, I first gathered the words into a multidimensional array, which involved a complex loop to access the individual words. It worked, but I thought there must be a simpler way of gathering the words into a simple array in the first place. It took a couple of attempts before I hit on array_merge(). Initially, I used array_push(), but that still created a multidimensional array.
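
The difference between the two functions can be sketched as follows. The array_push() attempt was not posted, so the first loop is only a guess at what it might have looked like, and the sample phrases are made up.

```php
<?php
// Made-up phrases standing in for the matched results.
$phrases = array('red apples', 'green pears');

// array_push() appends each exploded array as ONE element,
// so the result is an array of arrays.
$nested = array();
foreach ($phrases as $phrase) {
  array_push($nested, explode(' ', $phrase));
}

// array_merge() appends the elements themselves,
// so the result stays a flat list of words.
$flat = array();
foreach ($phrases as $phrase) {
  $flat = array_merge($flat, explode(' ', $phrase));
}

print_r($nested); // two inner arrays of two words each
print_r($flat);   // four words in one array
```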

One very important technique is to use echo or print_r() to examine the results of a function or loop. It helps enormously if you can see the results of each stage of the operation.
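
As a small illustration of that habit, the sketch below dumps the intermediate results at two stages. The input string is made up; wrapping print_r() in pre tags is just one common way to keep the output readable in a browser.

```php
<?php
// Made-up input for illustration.
$content = 'Buy <b>fresh bread</b> here.';

preg_match_all('/<b>(.*?)<\/b>/', $content, $search_results);

// Stage 1: inspect what preg_match_all() actually captured.
// Index 0 holds the full matches, index 1 the first capture group.
echo '<pre>';
print_r($search_results);
echo '</pre>';

// Stage 2: inspect the words after splitting one phrase.
$words = explode(' ', $search_results[1][0]);
echo '<pre>';
print_r($words);
echo '</pre>';
```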

The rest simply comes down to a lot of reading. When I first started doing PHP, I bought "Programming PHP" by Rasmus Lerdorf and Kevin Tatroe. It's not an easy book to sit down and read from cover to cover, because it's rather like a grammar book. However, I did read the first half of the book to get an overview of the core features of the language. Since then, I keep it close at hand. It's one of the most heavily-thumbed and dog-eared books in my collection. You don't even need a copy of the book. If you go through the Strings and Arrays sections of the PHP online manual, you will learn a huge amount of useful material. I don't keep everything in my head, but I usually know where to look up details of the function I'll need.

Keep at it. You'll get there in the end.