Multiple simultaneous substring replacements in Java


(I come from the python world, so I apologise if some of the terminology I use jars with the norm.)

I have a String with a List of start/end indices to replace. Without getting too much into detail, consider this basic mockup:

String text = "my email is foo@bar.com and my number is (213)-XXX-XXXX";
List<Token> findings = SomeModule.someFnc(text);

And Token has the definition of

class Token {
    int start, end;
    String type;
}

This List represents start and end positions of sensitive data that I'm trying to redact.

Effectively, the API returns data that I iterate over to get:

[{ "start" : 12, "end" : 22, "type" : "EMAIL_ADDRESS" }, { "start" : 41, "end" : 54, "type" : "PHONE_NUMBER" }] 

Using this data, my end goal is to redact the tokens in text specified by these Token objects to get this:

"my email is [EMAIL_ADDRESS] and my number is [PHONE_NUMBER]" 

The thing that makes this question non-trivial is that the replacement substrings aren't always the same length as the substrings they're replacing.

My current plan of action is to build a StringBuilder from text, sort the tokens in descending order of start index, and then replace from the right end of the buffer, so that each replacement leaves the indices to its left unchanged.
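For reference, a minimal sketch of that plan. It assumes the end index is inclusive (as in the sample data) and that tokens do not overlap; the class name ReverseRedactor and the method redact are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class ReverseRedactor {
    // Mirrors the Token class above; end is treated as inclusive (assumption).
    static class Token {
        final int start, end;
        final String type;

        Token(int start, int end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }
    }

    static String redact(String text, List<Token> findings) {
        // Sort a copy so the caller's list keeps its original order.
        List<Token> sorted = new ArrayList<>(findings);
        // Descending by start: an edit on the right never shifts indices on the left.
        sorted.sort(Comparator.comparingInt((Token t) -> t.start).reversed());

        StringBuilder buffer = new StringBuilder(text);
        for (Token t : sorted) {
            // StringBuilder.replace takes an exclusive end index, hence end + 1.
            buffer.replace(t.start, t.end + 1, "[" + t.type + "]");
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        String text = "my email is foo@bar.com and my number is (213)-XXX-XXXX";
        List<Token> findings = new ArrayList<>();
        findings.add(new Token(12, 22, "EMAIL_ADDRESS"));
        findings.add(new Token(41, 54, "PHONE_NUMBER"));
        System.out.println(redact(text, findings));
    }
}
```

Each replace call shifts the tail of the buffer, so with many tokens in a long text this does more copying than a single forward pass would.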

But something tells me there should be a better way... is there?

 


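This approach works. It builds the result in a single left-to-right pass, copying the text between findings and appending a placeholder for each finding. It relies on the findings being sorted by start index, not overlapping, and using inclusive end indices, as in your example: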

import java.util.ArrayList;
import java.util.List;

public class Test {
    public static void main(String[] args) {
        String text = "my email is foo@bar.com and my number is (213)-XXX-XXXX";

        List<Token> findings = new ArrayList<>();
        findings.add(new Token(12, 22, "EMAIL_ADDRESS"));
        findings.add(new Token(41, 54, "PHONE_NUMBER"));

        System.out.println(replace(text, findings));
    }

    public static String replace(String text, List<Token> findings) {
        int position = 0; // index of the first character not yet copied
        StringBuilder result = new StringBuilder();

        for (Token finding : findings) {
            // Copy the untouched text before this finding, then the placeholder.
            result.append(text.substring(position, finding.start));
            result.append('[').append(finding.type).append(']');

            // end is inclusive, so copying resumes one character past it.
            position = finding.end + 1;
        }

        // Copy whatever follows the last finding.
        return result.append(text.substring(position)).toString();
    }
}

class Token {
    int start, end;
    String type;

    Token(int start, int end, String type) {
        this.start = start;
        this.end = end;
        this.type = type;
    }
}

Output:

my email is [EMAIL_ADDRESS] and my number is [PHONE_NUMBER] 
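If the API does not guarantee the order of the findings, a defensive variant might sort a copy first and reject overlaps. This is only a sketch under the same inclusive-end assumption; the names SortedRedactor and redact are hypothetical, and throwing on overlap is one possible policy rather than the only sensible one.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class SortedRedactor {
    // Same shape as the Token class above; end is treated as inclusive (assumption).
    static class Token {
        final int start, end;
        final String type;

        Token(int start, int end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }
    }

    static String redact(String text, List<Token> findings) {
        // Sort a copy ascending by start so the forward pass sees tokens in text order.
        List<Token> sorted = new ArrayList<>(findings);
        sorted.sort(Comparator.comparingInt((Token t) -> t.start));

        StringBuilder result = new StringBuilder();
        int position = 0; // index of the first character not yet copied
        for (Token t : sorted) {
            if (t.start < position) {
                // This token begins inside the previous one.
                throw new IllegalArgumentException("overlapping token at index " + t.start);
            }
            // append(CharSequence, start, end) copies the range without a temporary substring.
            result.append(text, position, t.start);
            result.append('[').append(t.type).append(']');
            position = t.end + 1;
        }
        // Copy whatever follows the last token.
        return result.append(text, position, text.length()).toString();
    }
}
```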
