How to insert a space between Chinese and English characters?


I have a string where Chinese characters and English characters are next to each other:

我Love Perl 6哈哈 

I want to insert a space between the Chinese characters and the English characters:

我 Love Perl 6 哈哈 

I found that \u4e00-\u9fa5 represents Chinese characters:

'哈' ~~ /<[\u4e00..\u9fa5]>/

but this results in:

Potential difficulties:
    Repeated character (0) unexpectedly found in character class
    at line 2
    ------> '哈' ~~ /<[\u4e00..\⏏u9fa5]>/

so how to match a Chinese character?

 


The main problem is that \u is not a valid escape.

> "\u4e00"
===SORRY!=== Error while compiling:
Unrecognized backslash sequence: '\u'
------> "\⏏u4e00"

\x is, though.

> "\x4e00"
一

At any rate, the character class you are trying to use doesn't cover all Chinese characters.

> '㒠' ~~ /<[\x4e00..\x9fa5]>/
Nil

What you probably want is to match on a script.

> '㒠' ~~ /<:Han>/
「㒠」

This has the benefit that you don't have to keep changing your character class every time a new set of characters gets added to Unicode.
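As a rough illustration of the difference, the sketch below checks a few characters against both the fixed range and the script property. The helper names in-range and is-han are made up for this example; they are not part of any library.

```raku
# Hypothetical helpers comparing a hard-coded range with the Han script property.
sub in-range(Str $ch --> Bool) { so $ch ~~ / ^ <[\x4e00..\x9fa5]> $ / }
sub is-han(Str $ch --> Bool)   { so $ch ~~ / ^ <:Han> $ / }

for '哈', '㒠', 'a' -> $ch {
    # '㒠' (U+34A0) lies outside the range but is still a Han character.
    say "$ch range={in-range($ch)} script={is-han($ch)}";
}
```

Only the script check accepts both '哈' and '㒠', and neither check accepts 'a'.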


With that, you could do any of the following:

# store in $0 and $1
say S/(<:Han>)(<:Latin>)/$0 $1/ given '我Love Perl 6哈哈';
say S{(<:Han>)(<:Latin>)} = "$0 $1" given '我Love Perl 6哈哈';
# same with subst
say '我Love Perl 6哈哈'.subst: /(<:Han>)(<:Latin>)/, {"$0 $1"};

# only match between the two
say S/<:Han> <( )> <:Latin>/ / given '我Love Perl 6哈哈';
say S{<:Han> <( )> <:Latin>} = ' ' given '我Love Perl 6哈哈';

To change the value in a variable, use s/// or .=subst:

my $v = '我Love Perl 6哈哈';

$v ~~ s/(<:Han>)(<:Latin>)/$0 $1/;
$v ~~ s{(<:Han>)(<:Latin>)} = "$0 $1";
$v ~~ s/<:Han> <()> <:Latin>/ /;

$v .= subst: /(<:Han>)(<:Latin>)/, {"$0 $1"};
$v .= subst: /<:Han> <()> <:Latin>/,' ';

Note that <( excludes everything before it from the match, and )> does the same for everything after it. (Each can be used on its own.)
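A small sketch of what the capture markers report may help here. The strings and variable names are only illustrative.

```raku
# The whole pattern has to match, but only the part between <( and )> is reported.
my $m = 'abc' ~~ / a <( b )> c /;
say $m.Str;    # b
say $m.from;   # 1
say $m.to;     # 2

# With nothing between the markers, the match is an empty string
# located at the boundary between the two characters.
my $gap = '我Love' ~~ / <:Han> <( )> <:Latin> /;
say $gap.from; # 1
```

This is why replacing the match with ' ' inserts a space without touching the characters on either side.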

You may want to use an inverted match for the following character instead.

S/<:Han> <( )> [ <!:Han> & <!space> ]/ / 

(Match a character that is at the same time not Han and not a space.)
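The examples above only cover a Han character followed by something else, and they replace just the first occurrence. Digits such as 6 belong to the Common script rather than Latin, so the "6哈" boundary in the question needs separate handling. A possible sketch that covers both directions with the :g adverb follows; the sub name space-mixed is hypothetical, and treating only Latin letters and decimal digits as "English" is an assumption of this example.

```raku
# Hypothetical helper: add a space wherever a Han character touches a
# Latin letter or a decimal digit, in either order.
sub space-mixed(Str $text --> Str) {
    # Han followed by a Latin letter or digit, e.g. "我L" becomes "我 L".
    my $step = $text.subst(/ (<:Han>) (<:Latin> | <:Nd>) /, -> $/ { "$0 $1" }, :g);
    # Latin letter or digit followed by Han, e.g. "6哈" becomes "6 哈".
    $step.subst(/ (<:Latin> | <:Nd>) (<:Han>) /, -> $/ { "$0 $1" }, :g);
}

say space-mixed('我Love Perl 6哈哈');   # 我 Love Perl 6 哈哈
```

Text that already has spaces at the boundaries is left unchanged, because a space is neither Latin, a digit, nor Han.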
