August 9, 2021
Answered

How to extract parts of a filename using RegExp?

  • August 9, 2021
  • 3 replies
  • 1075 views

I don't use RegExp often, but in most cases I can handle it.

 

Now I have a list of filenames from cameras in the format
_MG_4991.cr2
DSC_0986.JPG

and so on. Between the numeric index and the file extension there can be arbitrary characters (as a rule, the photographer's comments).

 

I want to parse these names into parts so that I get a separate prefix specific to the camera (for example, _MG_ or DSC_ or NEF_), the numerical value of the frame number and the file extension. At the same time, I need to ignore the comment of the photographer.

 

I read the RegExp documentation for a long time and ended up doing this:

 

var init = decodeURI(etTarget.path[i].name).match(new RegExp('([\D]*)(\d+)[\D]*(\.[\D]*)'))

 

(where etTarget.path[i] is an element of an array of File objects). But it doesn't work.

I hope there are people here who are more advanced in working with strings and I will get an answer 🙂

Correct answer Kukurykus

 

f=['_MG_4991CommentTwo.cr2', 'DSC_0986CommentOne.JPEG', 'NEF_1234CommentThree.nef'], obj={}
while(f.length)eval(f.shift().replace(/(\w{4}\d{4})(\w+)(\..{3,4}$)/i,'obj["$1$3"]="$2"'))
f = ['_MG_4991Comment 1.cr2', 'DSC_0986Comment 2.JPEG', 'NEF_1234Comment 3.nef'], obj = {}
while(f.length)eval(f.shift().replace(/(\w{4}\d{4})([\w ]*)(\..{3,4}$)/i,'obj["$1$3"]="$2"'))
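
For anyone who would rather not go through eval, here is a minimal sketch of the same idea using exec. The helper name parseCameraName and the anchored pattern are assumptions of this sketch, not part of the answer above; it treats everything between the frame number and the last dot as the comment.

```javascript
// Hypothetical helper (not from the thread): splits a camera file name into
// prefix, frame number, comment and extension without using eval.
function parseCameraName(name) {
    // ^(\D*)      prefix: leading non-digits, e.g. "_MG_" or "DSC_"
    // (\d+)       frame number: the first run of digits
    // (.*)        comment: anything up to the last dot (may be empty)
    // (\.[^.]+)$  extension: the last dot and whatever follows it
    var m = /^(\D*)(\d+)(.*)(\.[^.]+)$/.exec(name);
    if (!m) return null; // the name does not follow the expected layout
    return { prefix: m[1], number: m[2], comment: m[3], extension: m[4] };
}

var samples = ['_MG_4991.cr2', 'DSC_0986Comment 2.JPEG', 'AAA_1234 школа 10.jpg'];
var parsed = [];
for (var i = 0; i < samples.length; i++) {
    parsed.push(parseCameraName(samples[i]));
}
```

The frame number is kept as a string so that leading zeros survive; parseInt(result.number, 10) gives the numeric value if that is what is needed.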

 

3 replies

Legend
August 9, 2021

AHEM!!!!

 

try {
fls=['AAA_1234 школа 10.jpg', 'DSC_0986Comment One.JPEG', 'NEF_1234CommentThree.nef'], obj={}
while(fls.length)eval(fls.shift().replace(/(\w{4}\d{4})(\w+)(\..{3,4}$)/i,'obj["$1$3"]="$2"'))

alert(obj.toSource())
} catch (e) { alert(e); }

var s = "AAA_1234 школа 10.jpg"

var regExp = /([\D]*)(\d+)[\D]*(\.[\D]*)/
init = regExp.exec(decodeURI(s));

alert(init.toSource())

 

???
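
To make the two results above easier to compare, here is a small sketch that runs both patterns on the same name without eval or alert. The variable names are illustrative and the patterns are copied from the posts; the behaviour described in the comments assumes a standard JavaScript regex engine, and ExtendScript may differ in details.

```javascript
var s = 'AAA_1234 школа 10.jpg';

// Pattern from the accepted answer: \w+ cannot cross the space after "1234",
// so nothing matches and replace() hands the name back unchanged.
// eval() would then be asked to run the bare file name as code.
var answerPattern = /(\w{4}\d{4})(\w+)(\..{3,4}$)/i;
var replaced = s.replace(answerPattern, 'obj["$1$3"]="$2"');

// Pattern from the question: it is not anchored, so the engine gives up on
// "AAA_1234" (no dot follows the non-digits after it) and matches further
// right, taking " школа " as the prefix and "10" as the frame number.
var questionPattern = /([\D]*)(\d+)[\D]*(\.[\D]*)/;
var found = questionPattern.exec(s);
```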

 

jazz-y (Author)
Legend
August 9, 2021

I have a limited selection of possible comments and filenames. In any case, now I understand the approach and I can fix everything myself.

 

What surprises me more is that match doesn't work in this case; I expected it to return the same result as exec. Clearly the problem is me and not match, but I can't see where the mistake is.

 

UPD. Got it. The quotes were unnecessary 🙂
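
The remark about quotes refers to building the pattern from a string. A short sketch of the difference, assuming standard JavaScript string-escaping rules (the variable names are illustrative):

```javascript
var fileName = '_MG_4991.cr2';

// Inside a quoted string "\D" and "\d" are read as plain "D" and "d",
// so the pattern that reaches RegExp has lost its backslashes.
var fromString = new RegExp('([\D]*)(\d+)[\D]*(\.[\D]*)');

// A regex literal keeps the backslashes, so match sees the same pattern as exec.
var literal = /([\D]*)(\d+)[\D]*(\.[\D]*)/;

// If a string really is needed, every backslash has to be doubled.
var doubled = new RegExp('([\\D]*)(\\d+)[\\D]*(\\.[\\D]*)');

var results = [
    fileName.match(fromString), // null: the name contains no literal "d"
    fileName.match(literal),    // an array of captured groups
    fileName.match(doubled)     // same as the literal
];
```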

Kukurykus
Legend
August 9, 2021

That was a standard approach covering the typical cases. It's not hard to improve:

 

eval('AAA_1234 школа 10.jpg'.replace(/(\w{4}\d{4})([\w школа]*)(\..{3,4}$)/i,'({"$1$3":"$2"})')).toSource()

 

jazz-y (Author)
Legend
August 9, 2021
var regExp = /([\D]*)(\d+)[\D]*(\.[\D]*)/,
init = regExp.exec(decodeURI(etTarget.path[0].name));

So this seems to work too.

Kukurykus
Legend
August 9, 2021

Yes, I originally kept your code, but then found that it fails for some names, so I wrote my own.
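
One kind of name where the shorter pattern seems to fall short is an extension that contains a digit, such as .cr2, because the last group only accepts non-digits. This is a guess at what is meant here, not something stated in the thread:

```javascript
var regExp = /([\D]*)(\d+)[\D]*(\.[\D]*)/;

// ".cr2" ends in a digit, and (\.[\D]*) stops at the first digit it meets,
// so the captured extension is cut short.
var cr2 = regExp.exec('_MG_4991.cr2');

// An all-letter extension is captured whole.
var jpg = regExp.exec('DSC_0986.JPG');

var extensions = [cr2[3], jpg[3]]; // ".cr" and ".JPG"
```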

Legend
August 9, 2021
quote

 

f=['_MG_4991CommentTwo.cr2', 'DSC_0986CommentOne.JPEG', 'NEF_1234CommentThree.nef'], obj={}
while(f.length)eval(f.shift().replace(/(\w{4}\d{4})(\w+)(\..{3,4}$)/i,'obj["$1$3"]="$2"'))
f = ['_MG_4991Comment 1.cr2', 'DSC_0986Comment 2.JPEG', 'NEF_1234Comment 3.nef'], obj = {}
while(f.length)eval(f.shift().replace(/(\w{4}\d{4})([\w ]*)(\..{3,4}$)/i,'obj["$1$3"]="$2"'))

 


By @Kukurykus

 

messy again

f=['_MG_4991Comment Two.cr2', 'DSC_0986CommentOne.JPEG', 'NEF_1234CommentThree.nef'], obj={}
while(f.length)eval(f.shift().replace(/(\w{4}\d{4})(\w+)(\..{3,4}$)/i,'obj["$1$3"]="$2"'))

alert(obj.toSource())
Kukurykus
Legend
August 9, 2021

The first version was for comments made up of [a-z], [0-9] and _; the second one also allows spaces.
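
A quick way to see the difference between the two versions, using the name from the "messy again" reply. The helper name commentOf is made up for this sketch:

```javascript
// Returns the comment captured by the given pattern, or null when the
// pattern does not match the name at all.
function commentOf(pattern, fileName) {
    var m = pattern.exec(fileName);
    return m ? m[2] : null;
}

var wordsOnly = /(\w{4}\d{4})(\w+)(\..{3,4}$)/i;      // letters, digits, underscore
var withSpaces = /(\w{4}\d{4})([\w ]*)(\..{3,4}$)/i;  // the same, plus spaces

var messy = '_MG_4991Comment Two.cr2';
var first = commentOf(wordsOnly, messy);   // no match: the space stops \w+
var second = commentOf(withSpaces, messy); // captures "Comment Two"
```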