[Regex] URI (Solved)

MiST - 19/08/2009 00:07 (last edited 19/08/2009 00:09)
Member

Hello,

Here is a challenge for the übergeek (regexes are sooooo the Holy Grail). I am building a shoutbox (with ZF, nice). Nothing to it. BUT! I want the submitted text to be scanned for URIs so they can be formatted as HTML A tags.

So far I have built a very nice regex that works for a lot of cases, but it still has a few teething problems. Behold my whopper!

    $pattern = '#((http|https|ftp)://)([\w\d]+\.)([\w\d]+\.\w{2,4})(/[\w\d\?&=\+\\.-])#si';

It is replaced with:

    $replacement = preg_match('#((http|https|ftp)://)#si', '$1$3$4$5') ? '<a href="$1$3$4$5">$1$3$4$5</a>' : '<a href="http://$3$4$5">$3$4$5</a>';

This works for a lot of addresses (and I could still add IP addresses and port numbers). The pity is...

* the domains of email addresses get formatted (I do not want that)
* if you do not type a space after a full stop, you get a link. Well, that is not such a problem; people will just have to learn to type.

Adding a caret does not solve anything, because then URLs in the middle of the text are no longer formatted. Does anyone have an idea?
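A side note on the replacement above: the preg_match call runs once, on the literal string '$1$3$4$5', before preg_replace ever sees a match, so it cannot tell whether a given URL had a scheme. A minimal sketch of a per-match decision with preg_replace_callback follows. The function name linkify and the simplified pattern are assumptions made for illustration, not the poster's code, and the sketch does not address the email problem.

```php
<?php
// The original check inspects the literal text '$1$3$4$5', which contains no scheme.
var_dump(preg_match('#((http|https|ftp)://)#si', '$1$3$4$5')); // int(0)

// Hypothetical helper: decides per match whether a scheme must be prepended.
function linkify(string $text): string
{
    // Simplified pattern (assumption): optional scheme, host, optional path.
    $pattern = '~((?:https?|ftp)://)?((?:\w+\.)+[a-z]{2,6})((?:/[\w?&=+.-]*)*)~i';

    return preg_replace_callback($pattern, function (array $m): string {
        // $m[1] holds the scheme; it is an empty string when none was typed.
        $scheme = $m[1] !== '' ? $m[1] : 'http://';
        $href = htmlspecialchars($scheme . $m[2] . $m[3], ENT_QUOTES);
        $label = htmlspecialchars($m[0], ENT_QUOTES);
        return '<a href="' . $href . '">' . $label . '</a>';
    }, $text);
}

echo linkify('see www.example.com/page and http://example.org'), "\n";
```

The callback receives the groups of each individual match, which is what the ternary in the original replacement was trying to achieve.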

8 replies


Richard - 19/08/2009 00:25
General crew

(?<!@) maybe? ;p

A very simple solution, but that is exactly what you are asking for :]

darsstar - 19/08/2009 00:31 (last edited 19/08/2009 12:35)
New member

Try the following one (no, I did not write it):

    {(?<=\b)((?:https?|ftp)://|www\.)[\w.]+[;#&/~=\w+()?.,:%-][;#&/~=\w+(-]}i

Now, about your 'whopper'.

    ((http|https|ftp)://)

http://https://ftp://http://www.example.com will therefore also work. Try a plus sign. You can also make a non-capturing subpattern by putting "?:" right after the opening parenthesis; its content then no longer ends up in a $n.

    ([\w\d]+\.)*

As far as I know, \d is already included in \w.

    ([\w\d]+\.\w{2,4})

See above.

    (/[\w\d\?&=\+\\.-])*

See above. Also, ?, +, * and . do not need a backslash inside square brackets.

With a few improvements I get the following regex:

    #((?:(?:http|https|ftp)://)?(?:[\w]+\.)(?:[\w]+\.\w{2,4})(?:/[\w?&=+.-]))#si
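The three points made in this reply can be checked in isolation. The snippet below is a small sketch with made-up sample strings; the variable names are illustrative only.

```php
<?php
// A capturing group fills its own slot; a non-capturing group (?:...) does not.
preg_match('~(https?|ftp)://(\w+)~', 'ftp://files', $capturing);
preg_match('~(?:https?|ftp)://(\w+)~', 'ftp://files', $nonCapturing);

print_r($capturing);    // three entries: whole match, scheme, host
print_r($nonCapturing); // two entries: whole match, host

// \w already covers digits, so [\w\d] behaves the same as \w.
var_dump(preg_match('~^\w+$~', 'abc123'));

// Inside a character class, ?, + and . are literal without a backslash.
var_dump(preg_match('~^[?+.]+$~', '?+.'));
```

With the non-capturing group, the host moves from $2 to $1, which is why the replacement strings in the thread have to be renumbered whenever a group changes type.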

Richard - 19/08/2009 09:14
General crew

You call that improved? :')

    '~(?<!@)(?:(?:(?:ht|f)tps?|ftp)://)?(?:\w+\.)(?:\w+\.[a-z]{2,6})(?:/[\w?&=+.-])~i'

MiST - 19/08/2009 12:47 (last edited 19/08/2009 12:48)
Member

We are almost there, I think... The trick does not seem to work, though. What is going wrong?

What I have now:

    $patterns[0] = '~(?<!@)((?:(?:ht|f)tps?|ftp)://)?(\w+\.)(\w+\.[a-z]{2,6})((/[\w\d?&=+.-]))~i';

with this replacement:

    $replacements[0] = preg_match('#(?:(?:(?:ht|f)tps?|ftp)://)#si', '$1$2$3$4') ? '<a href="$1$2$3$4">$1$2$3$4</a>' : '<a href="http://$2$3$4">$2$3$4</a>';

What happens now: for an email address, the first character of the domain is skipped, but the rest is still formatted. So in alias@domein.extensie, everything after the first character of the domain is still turned into a link. That is not really the intention.

Can someone also clarify for me:

* what the i at the end does? The tilde is obviously the delimiter.
* how that trick (with the @) is actually supposed to work?

I am no regex hero; JeXuS is, judging by his description... All help is welcome though.
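Both questions can be reproduced with a reduced pattern. The sketch below uses a simplified host pattern (an assumption for illustration) guarded by the same negative lookbehind, plus a separate check of the i modifier.

```php
<?php
// Simplified host pattern (assumption), guarded by the (?<!@) from the thread.
$pattern = '~(?<!@)\w+\.[a-z]{2,6}~i';

// The lookbehind only rejects a match that starts directly after "@".
// The engine then retries one character later, where the preceding
// character is "d", so the rest of the domain still matches.
preg_match($pattern, 'alias@domain.com', $m);
echo $m[0], "\n"; // omain.com

// The trailing i makes the whole pattern case-insensitive.
var_dump(preg_match('~^[a-z]+$~', 'COM'));  // int(0)
var_dump(preg_match('~^[a-z]+$~i', 'COM')); // int(1)
```

The lookbehind constrains only the single position where a match begins; it says nothing about where else in the string a match may start.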

Richard - 19/08/2009 12:55
General crew

Yes, yes, I see the problem already; strange that I did not think of that :-)

Usually this is solved with (?<=^|\s) instead of that (?<!@).
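A short sketch of why the positive lookbehind behaves differently, again with a simplified host pattern that is an assumption for illustration: a match may now only begin at the start of the string or right after whitespace, so it cannot begin in the middle of an email address.

```php
<?php
// Simplified host pattern (assumption), guarded by the (?<=^|\s) from the thread.
$pattern = '~(?<=^|\s)\w+\.[a-z]{2,6}~i';

preg_match_all($pattern, 'example.com or alias@domain.com', $m);
print_r($m[0]); // contains only example.com

// Trade-off: a host that directly follows punctuation is skipped as well.
var_dump(preg_match($pattern, '(example.com)')); // int(0)
```

Whether the second case matters depends on how visitors type; for a shoutbox it may be an acceptable limitation.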

MiST - 19/08/2009 13:58
Member

*tries* *sees that it works* *happy*

JeXuS, you are my hero! If you could now also explain exactly what it all does... I like to understand everything I use, rather than copying it blindly (as I am doing a bit now 8-))

Richard - 19/08/2009 14:04

General crew


(?<=                 # Assert that the regex below can be matched, with the match ending at this position (positive lookbehind)
                     # Match either the regular expression below (attempting the next alternative only if this one fails)
      ^                    # Assert position at the beginning of the string (or of a line, when the m modifier is set)
   |                    # Or match regular expression number 2 below (the entire group fails if this one fails to match)
      \s                   # Match a single character that is a "whitespace character" (spaces, tabs, line breaks, etc.)
)
(                    # Match the regular expression below and capture its match into backreference number 1
   (?:                  # Match the regular expression below
                           # Match either the regular expression below (attempting the next alternative only if this one fails)
         (?:                  # Match the regular expression below
                                 # Match either the regular expression below (attempting the next alternative only if this one fails)
               ht                   # Match the characters "ht" literally
            |                    # Or match regular expression number 2 below (the entire group fails if this one fails to match)
               f                    # Match the character "f" literally
         )
         tp                   # Match the characters "tp" literally
         s                    # Match the character "s" literally
            ?                    # Between zero and one times, as many times as possible, giving back as needed (greedy)
      |                    # Or match regular expression number 2 below (the entire group fails if this one fails to match)
         ftp                  # Match the characters "ftp" literally
   )
   ://                  # Match the characters "://" literally
)?                   # Between zero and one times, as many times as possible, giving back as needed (greedy)
(                    # Match the regular expression below and capture its match into backreference number 2
   \w                   # Match a single character that is a "word character" (letters, digits, etc.)
      +                    # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
   \.                   # Match the character "." literally
)*                   # Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
(?:                  # Match the regular expression below
   \w                   # Match a single character that is a "word character" (letters, digits, etc.)
      +                    # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
   \.                   # Match the character "." literally
   [a-z]                # Match a single character in the range between "a" and "z"
      {2,6}                # Between 2 and 6 times, as many times as possible, giving back as needed (greedy)
)
(?:                  # Match the regular expression below
   (                    # Match the regular expression below and capture its match into backreference number 3
      /                    # Match the character "/" literally
      [\w?&=+*.-]          # Match a single character present in the list below
                              # A word character (letters, digits, etc.)
                              # One of the characters "?&=+*"
                              # The character "."
                              # The character "-"
         *                    # Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
   )*+                  # Between zero and unlimited times, as many times as possible, without giving back (possessive)
)
            
 

:-)
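For reference, the pattern explained above can be written in extended mode (the x modifier), where whitespace and # comments inside the pattern are ignored. The sketch below is one possible way to use it; the function name autolink and the preg_replace_callback wrapper are assumptions for illustration, not part of the thread.

```php
<?php
// The pattern from the explanation above, written with the x modifier.
$pattern = <<<'REGEX'
~
(?<=^|\s)                        # only at the start of the string or after whitespace
((?:(?:ht|f)tps?|ftp)://)?       # group 1: optional scheme
(\w+\.)*                         # group 2: any number of subdomain labels
(?:\w+\.[a-z]{2,6})              # domain plus a TLD of 2 to 6 letters
(?:(/[\w?&=+*.-]*)*+)            # group 3: path segments, matched possessively
~ix
REGEX;

// Hypothetical helper that wraps every match in an A tag.
function autolink(string $text, string $pattern): string
{
    return preg_replace_callback($pattern, function (array $m): string {
        // Group 1 is missing or empty when no scheme was typed.
        $prefix = ($m[1] ?? '') === '' ? 'http://' : '';
        $url = htmlspecialchars($m[0], ENT_QUOTES);
        return '<a href="' . $prefix . $url . '">' . $url . '</a>';
    }, $text);
}

echo autolink('Visit www.example.com/a/b?x=1 or mail alias@domain.com', $pattern), "\n";
```

The email address stays untouched because none of its characters after the first one follow whitespace, and the part before the @ contains no dot.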

MiST - 19/08/2009 17:40
Member

w00t!


This topic is closed.
