<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Plagiarism Checker</title>
	<atom:link href="http://www.plagiarismchecker.net/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.plagiarismchecker.net</link>
	<description>Plagiarism information and resources</description>
	<lastBuildDate>Thu, 28 Mar 2013 22:03:47 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5</generator>
		<item>
		<title>4. Plagiarism detection methods &#8211; conclusions and recommendations</title>
		<link>http://www.plagiarismchecker.net/plagiarism-detection/4-plagiarism-detection-recommendations/</link>
		<comments>http://www.plagiarismchecker.net/plagiarism-detection/4-plagiarism-detection-recommendations/#comments</comments>
		<pubDate>Thu, 28 Mar 2013 21:48:03 +0000</pubDate>
		<dc:creator>Plagiarism Checker</dc:creator>
				<category><![CDATA[Plagiarism Detection Methods]]></category>

		<guid isPermaLink="false">http://www.plagiarismchecker.net/?p=2383</guid>
		<description><![CDATA[<a href="http://www.plagiarismchecker.net/plagiarism-detection/4-plagiarism-detection-recommendations/"><img align="left" hspace="5" width="150" height="150" src="http://www.plagiarismchecker.net/wp-content/plugins/thumbnail-for-excerpts/tfe_no_thumb.png" class="alignleft wp-post-image tfe" alt="" title="" /></a>4.1        Conclusions As higher educational institutes pursue a greater standard of accountability, the broad spectrum of plagiarism detection mechanisms, increasingly complex definition of plagiarism practises, and high degree of inconsistency in replicability and interpretation in analytical outputs is diluting the<span class="ellipsis">&#8230;</span><div class="read-more"><a href="http://www.plagiarismchecker.net/plagiarism-detection/4-plagiarism-detection-recommendations/">Read more &#8250;</a></div><!-- end of .read-more -->]]></description>
				<content:encoded><![CDATA[<h2>4.1        Conclusions</h2>
<p>As higher educational institutes pursue a greater standard of accountability, the broad spectrum of plagiarism detection mechanisms, the increasingly complex definition of plagiarism practices, and the high degree of inconsistency in the replicability and interpretation of analytical outputs are diluting the accuracy and adequacy of such measures.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn1">[1]</a>  In fact, one of the most important elements that has yet to be adequately addressed in plagiarism research, theory, and practice is the function of semantic analysis and qualitative factors in the inconsistent capacity for dissecting student works.  Past evidence in this field (see Youmans<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn2">[2]</a> and Zeman et al.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn3">[3]</a>) has demonstrated that there is a significant degree of variation in plagiarism outputs when assessing longitudinal evidence relating to multi-functional and multi-faceted writing assignments.  Specifically, the likelihood of plagiarism by virtue of topic or level is increased according to the relative ‘difficulty’ of the subject or type of assignment in question.  This degree of variability is an important consideration when assessing plagiarism monitoring and detection instruments, particularly when students’ enrolment and continued higher education depend upon a passing output score on their university’s instrument of choice.</p>
<p>This research has demonstrated two primary dimensions of plagiarism monitoring and assessment: intrinsic and extrinsic.  Focusing on grammaticality and linguistic relationships and voice, intrinsic plagiarism detection is quickly becoming a valuable mechanism for expedient, corpus-free monitoring of students’ work.  Alternatively, the more traditional extrinsic model of plagiarism detection requires what has become a proprietary factor: a comprehensive database of comparative resources.  Whilst the analytical approaches employed in each of these techniques may vary, their primary objectives do not: to identify passages or concepts which a student has plagiarised from an unreferenced source in an effort to present them as their own.  Yet this objective itself is indicative of a fundamental conflict at the core of detection mechanisms as academics find that the identification and representation of intention is extremely difficult to qualify.  Given that students have recognised the potential deficiencies of a copy-paste strategy in plagiarism obfuscation, new techniques focusing on intra-corporal manipulation and multi-language translation and manipulation are much more difficult to monitor and assess.</p>
<p>This research has provided a critical analysis of many different studies in this field, which demonstrate a persistent focus on modelling and approach whilst failing to sufficiently address the core limitations and practical constraints of these multi-faceted models.  With fuzzy analysis and semantic reasoning quickly becoming figureheads in more intuitive system design, it is evident from these findings that the nature of plagiarism detection is adapting in order to meet the challenge of subverting more complex student efforts.  Seemingly, the foundations of plagiarism detection are based upon a stepwise progression, whereby the mechanisms of textual copying and manipulation are identified and then prevented as academics pursue the most comprehensive of monitoring resources.  For students, the knowledge of more in-depth monitoring practices may act as a fundamental deterrent; however, it is evident that when faced with a choice between academic deviance and scholastic failure, deviance is likely to become a viable consideration.</p>
<h2>4.2        Recommendations</h2>
<p>The primary aim of this research was to assess the breadth of plagiarism monitoring and detection resources that are currently being employed in higher education institutions across the developed world.  With globalised scholasticism a quickly evolving phenomenon, the scope of student ethics and value systems is magnified, requiring a much more definitive stance against plagiarism and its multiple iterations.  For this reason, the findings in this analysis have revealed a continuum of student-oriented adaptations that are quickly altering the scope of monitoring and detection platforms.  Accordingly, there are three primary dimensions of the effective plagiarism deterrence and detection system that have been extracted from this breadth of academic evidence.  The following is a brief overview of this stepwise process which must consider both student and academic interests in its implementation:</p>
<ul>
<li>Education and Deterrence:  Managing plagiarism in HEIs involves effective deterrence techniques.  Whilst a zero tolerance policy is effective, it is evident from the findings of such researchers as Postle (2009)<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn4">[4]</a> and Introna and Hayes (2011)<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn5">[5]</a> that student decision making is oftentimes based upon concerns that are beyond the scope of policy constraints.  Therefore, by instilling a robust sense of ethics, responsibility, and academic honour amongst the student population, it is more likely that the incidence of plagiarism will naturally decline.
<ul>
<li>The Grey Area:  One of the major challenges in anti-plagiarism policymaking is that conceptual and textual copying are frequently two different elements in student writing.  Whilst a student might summarise a particular concept or idea and neglect to attribute it, the ability to detect such offences is limited by the depth of analysis within the system itself.  Conversely, the student might copy a published insight without attribution in an attempt to convey a robust idea or concept (due to oversight or otherwise honest mistakes).  The challenge for educators is to determine what constitutes plagiarism and how these standards can be enforced universally.  This grey area paradox may ultimately result in an intensified, formal rigidity which erroneously impacts upon the educational experience of ethically superior, honest students.</li>
</ul>
</li>
<li>Intrinsic Plagiarism Analysis:  By starting with the student’s voice itself and leveraging a corpus of student work in the assessment of possible plagiarism, it becomes possible to statistically identify areas in which the student may have borrowed or taken ideas without appropriately sourcing the work of an outside author.  The ability to assess student works on the basis of grammatical and semantic consistency is an important step forward in plagiarism detection and places a much lower demand on the system than extrinsic, multi-database sourcing entails.</li>
<li>Extrinsic, Semantic Plagiarism Analysis:  Whilst most of the current service providers base their platform on extrinsic plagiarism detection, the underlying value of this approach is limited by the creativity and skill associated with modern student plagiarism.  Therefore, the semantic, global thesaurus-based analysis of student works is likely to yield a much more accurate finding regarding the manipulation of language in order to copy or summarise the works of others.  From multi-lingual to native language analysis, the broad spectrum of this analysis is likely to make detection complicated and require a comprehensive database (e.g. university resources including journals, books, etc.) that can span across multinational environments.</li>
</ul>
<p>These three phases of plagiarism subversion, monitoring and detection are only a first stage in the analytical protocols that must be employed at the core of any university.  With students continuing to engage in more subversive behaviour, the likelihood of eliminating plagiarism through any singular strategy is minimal.  However, by broadening the scope of assessment, enhancing the depth of analysis, and re-focusing existing protocols on a more complex spectrum of dimensions, it is hypothesised that the incidence of plagiarism will begin to decrease over time.</p>
<div>
<hr align="left" size="1" width="33%" />
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref1">[1]</a> K. Postle, ‘Detecting and Deterring Plagiarism in Social Work Students: Implications for Learning for Practice,’ (2009) <i>Social Work Education</i>, Vol. 28, No. 4, p. 358.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref2">[2]</a> R.J. Youmans, ‘Does the Adoption of Plagiarism-Detection Software in Higher Education Reduce Plagiarism?’ (2011) <i>Studies in Higher Education</i>, Vol. 36, No. 7, p. 760.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref3">[3]</a> L.D. Zeman, J.A. Steen, and N.M. Zeman, ‘Originality Detection Software in a Graduate Policy Course: A Mixed Methods Evaluation of Plagiarism,’ <i>Journal of Teaching in Social Work</i>, Vol. 31, No. 4, p. 439.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref4">[4]</a> K. Postle, ‘Detecting and Deterring Plagiarism in Social Work Students: Implications for Learning for Practice,’ (2009) <i>Social Work Education</i>, Vol. 28, No. 4.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref5">[5]</a> L.D. Introna and N. Hayes, ‘On Sociomaterial Imbrications: What Plagiarism Detection Systems Reveal and Why it Matters,’ (2011).</p>
</div>
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismchecker.net/plagiarism-detection/4-plagiarism-detection-recommendations/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Text Summarisation and Fuzzy Swarm Techniques for Plagiarism Detection</title>
		<link>http://www.plagiarismchecker.net/plagiarism-detection/text-summarisation/</link>
		<comments>http://www.plagiarismchecker.net/plagiarism-detection/text-summarisation/#comments</comments>
		<pubDate>Thu, 28 Mar 2013 21:46:18 +0000</pubDate>
		<dc:creator>Plagiarism Checker</dc:creator>
				<category><![CDATA[Plagiarism Detection Methods]]></category>

		<guid isPermaLink="false">http://www.plagiarismchecker.net/?p=2378</guid>
		<description><![CDATA[<a href="http://www.plagiarismchecker.net/plagiarism-detection/text-summarisation/"><img align="left" hspace="5" width="150" height="150" src="http://www.plagiarismchecker.net/wp-content/uploads/2013/03/swarm-150x150.png" class="alignleft wp-post-image tfe" alt="swarm" title="" /></a>Other researchers in this field have also proposed mechanisms for text analysis and data parsing through the use of a fuzzy swarm tool.  Binwahlan et al., for example, propose that automatic text summarisation offers an opportunity to ‘condense the source<span class="ellipsis">&#8230;</span><div class="read-more"><a href="http://www.plagiarismchecker.net/plagiarism-detection/text-summarisation/">Read more &#8250;</a></div><!-- end of .read-more -->]]></description>
				<content:encoded><![CDATA[<p>Other researchers in this field have also proposed mechanisms for text analysis and data parsing through the use of a fuzzy swarm tool.  Binwahlan et al., for example, propose that automatic text summarisation offers an opportunity to ‘condense the source text by extracting its most important content that meets a user’s or application’s needs’.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn1">[1]</a>  Within the document itself, the diversity-based selection process controls for redundancy through maximal marginal relevance calculations, wherein the centrality involves the summation of three features including the similarity between the sentence in hand and each document sentence, shared friends (the group of sentences which are similar to both sentences) and shared n-grams (group of n-grams which are contained in both sentences).<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn2">[2]</a>  Procedurally, the Binwahlan et al. methodology involves the input document, pre-processing, feature extraction, sentence clustering and binary tree building, sentence order in binary tree, and summary generation.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn3">[3]</a>  When applied to the document, the swarm-based summarisation is used to select the top sentences which have the highest scores according to sentence centrality, the title feature, the word sentence score, the key word feature, and the similarity to the first sentence.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn4">[4]</a>  Figure 2 offers a stepwise model of the assessment process, incorporating both features and swarm-based sentence analysis into a singular diversity-based summarisation output.</p>
<p><img class="alignnone size-full wp-image-2379" alt="swarm" src="http://www.plagiarismchecker.net/wp-content/uploads/2013/03/swarm.png" width="331" height="270" /></p>
<p align="center">Figure 2: MMI Diversity-Based Text Summarisation and Swarm-Based Text Summarisation</p>
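<p>The redundancy control described above can be illustrated with a generic maximal marginal relevance selection.  The following is a minimal sketch in Python and not the Binwahlan et al. implementation: the function names <code>jaccard</code> and <code>mmr_select</code>, the word-overlap similarity, and the <code>trade_off</code> weight are all assumptions made for illustration.</p>

```python
def jaccard(text_a, text_b):
    """Word-overlap similarity between two sentences, from 0.0 to 1.0."""
    words_a = set(text_a.lower().split())
    words_b = set(text_b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a.intersection(words_b)) / len(words_a.union(words_b))


def mmr_select(sentences, query, k=2, trade_off=0.5):
    """Greedily pick k sentences, trading relevance against redundancy."""
    selected = []                      # indices already in the summary
    candidates = list(range(len(sentences)))

    while candidates and len(selected) != k:

        def mmr_score(index):
            # Relevance: similarity to the query (e.g. the document title).
            relevance = jaccard(sentences[index], query)
            # Redundancy: similarity to the closest sentence already chosen.
            redundancy = max(
                (jaccard(sentences[index], sentences[j]) for j in selected),
                default=0.0,
            )
            return trade_off * relevance - (1 - trade_off) * redundancy

        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)

    return [sentences[i] for i in selected]


sentences = [
    "fuzzy swarm summarisation model",
    "fuzzy swarm summarisation model",
    "neural network self organising map",
]
summary = mmr_select(sentences, query="fuzzy swarm model", k=2)
```

<p>With an equal weighting of relevance and redundancy, the duplicate second sentence is penalised and the third sentence is selected instead, which is the behaviour a diversity-based method is intended to produce.</p>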
<p>Extending beyond the traditional scope of the MMI summarisation model, Binwahlan et al. propose the use of a hybrid model which includes the diversity-based method, the fuzzy swarm-based method, and a combination of a swarm-diversity method and a swarm-only method.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn5">[5]</a>  The resultant, combinative model is exemplified in Figure 3, highlighting a three-tiered feature analysis process which produces three distinct summaries that are then input into the selector procedure and result in a final, summative output based upon textual importance.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn6">[6]</a>  Whilst this model itself is designed for large-scale text analysis (e.g. legal briefs), the relevance of these techniques to the identification of internal plagiarism is significant and reflects a unique opportunity to establish similarity dimensions that are content-derived and semantically based.</p>
<p><img class="alignnone size-full wp-image-2380" alt="swarm2" src="http://www.plagiarismchecker.net/wp-content/uploads/2013/03/swarm2.png" width="291" height="341" /></p>
<p align="center">Figure 3: Fuzzy Swarm Diversity Hybrid Model for Automatic Text Summarization</p>
<p>Within the dimensions of text summarisation technologies, Alguliev et al. emphasise that there are four distinct classifications of the textual output including descriptive, evaluative, indicative, and informative.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn7">[7]</a>  Most relevant to the current study, the evaluative output represents a critical response to the source, allowing for summative elements to be compared or assessed according to some control variable(s).  Within this model, multi-document summarisation allows for multiple texts to be summarised simultaneously, clustering outputs and allowing for knowledge synthesis or discovery.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn8">[8]</a>  The proposed Alguliev et al. text summarisation model involves a core problem of assessing a given textual input in order to reproduce the primary content in a summative form according to three core factors including relevance, redundancy, and length.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn9">[9]</a>  Based upon the quantitative assessment of similarity throughout the input text, the researchers utilise a branch-and-bound algorithm to find the optimal solution (a hard problem) and incorporate a particle swarm optimisation algorithm to evaluate the fitness of each intra-textual element and select among them.  This particular methodology involves comparison of each particle according to its position in the continuous <i>n</i>-dimensional search space, whereby the best previous position and the velocity are recorded and analysed according to an iterative global best.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn10">[10]</a>  Whilst the Alguliev output is an original, summative document, the particle swarm methodology is validated as an effective assessment for relevance, redundancy, and length in textual comparisons, a finding which has particular implications for plagiarism assessment.</p>
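<p>The particle bookkeeping described above (position, velocity, best previous position, and an iteratively updated global best) can be sketched with a basic continuous particle swarm.  This is a generic illustration in Python and not the Alguliev et al. model: the function name <code>particle_swarm_minimise</code>, the constants for inertia and attraction, the search bounds, and the toy <code>sphere</code> objective are all assumptions.</p>

```python
import random


def particle_swarm_minimise(objective, dimensions, particles=20,
                            iterations=100, seed=0):
    """Minimise objective over a continuous search space with a basic swarm."""
    rng = random.Random(seed)
    # Assumed constants: inertia, pull to personal best, pull to global best.
    inertia, cognitive, social = 0.7, 1.5, 1.5

    positions = [[rng.uniform(-5.0, 5.0) for _ in range(dimensions)]
                 for _ in range(particles)]
    velocities = [[0.0] * dimensions for _ in range(particles)]

    # Each particle records the best position it has visited so far.
    best_positions = [list(p) for p in positions]
    best_values = [objective(p) for p in positions]

    # The global best starts as the best of all personal bests.
    leader = min(range(particles), key=lambda i: best_values[i])
    global_position = list(best_positions[leader])
    global_value = best_values[leader]

    for _ in range(iterations):
        for i in range(particles):
            for d in range(dimensions):
                r1, r2 = rng.random(), rng.random()
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + cognitive * r1 * (best_positions[i][d] - positions[i][d])
                    + social * r2 * (global_position[d] - positions[i][d])
                )
                positions[i][d] += velocities[i][d]

            value = objective(positions[i])
            if value < best_values[i]:
                best_positions[i] = list(positions[i])
                best_values[i] = value
            if value < global_value:
                global_position = list(positions[i])
                global_value = value

    return global_position, global_value


def sphere(point):
    """Toy objective: sum of squares, smallest at the origin."""
    return sum(x * x for x in point)


best_point, best_value = particle_swarm_minimise(sphere, dimensions=2)
```

<p>In a summarisation setting the objective would instead score a candidate selection of sentences for relevance, redundancy, and length; the toy objective here only demonstrates the update rule.</p>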
<p>Symbolic data, such as linguistic outputs, represents a complex and oftentimes dissociated spectrum of elements and dimensions which Yang et al. propose can be critically assessed through the use of neural networks and self-organising maps (SOM).<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn11">[11]</a>  Specifically, the SOM ‘uses the neighbourhood interaction set to approximate lateral neural interaction and discover the topological structures hidden within the data’.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn12">[12]</a>  Symbolic data types differ from numerical evidence; therefore, Yang et al. apply a dissimilarity/similarity measure in order to distinguish between component features.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn13">[13]</a>  Within this complex interweaving of algorithmic relationships, the researchers strive to extend the SOM beyond purely quantitative definitions, applying the symbolic relationship to allow for the analysis of distance measures and dissimilarity factors in membership-based comparisons.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn14">[14]</a>  Whilst the specific foundations of this research are based upon social clustering problems, the relevance of such advanced neural networking techniques to plagiarism detection is evident and would likely serve as a unique step forward towards resolving both semantic and corpus based dilemmas.</p>
<h2>3.4 Summary</h2>
<p>The methods introduced in this chapter employ an advanced, multi-dimensional technique which is based upon the concepts of fuzzy sets, swarm modelling, and neural networks.  Considering that knowledge itself is self-organising, the ability to conceptually and contextually cluster inter-textual characteristics across singular and corpus-based documents is a unique advantage for identifying similarities in plagiarism detection.  The textual summarisation models establish a synthetic foundation for semantic and contextual comparisons, whereby conditions of plagiarism can be explicitly outlined.  Whilst all of these models have yet to be adequately applied in an academic setting, their vision and perspective are distinctive and valuable for re-assessing the current limitations of existing similarity and comparison models.</p>
<div>
<hr align="left" size="1" width="33%" />
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref1">[1]</a> M.S. Binwahlan, N. Salim, and L. Suanmali, ‘Fuzzy Swarm Diversity Hybrid Model for Text Summarization,’ <i>Information Processing and Management</i>, Vol. 46, pp. 571-588.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref2">[2]</a> M.S. Binwahlan, N. Salim, and L. Suanmali, ‘Fuzzy Swarm Diversity Hybrid Model for Text Summarization,’ <i>Information Processing and Management</i>, Vol. 46, p. 572.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref3">[3]</a> M.S. Binwahlan, N. Salim, and L. Suanmali, ‘Fuzzy Swarm Diversity Hybrid Model for Text Summarization,’ <i>Information Processing and Management</i>, Vol. 46, p. 574.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref4">[4]</a> M.S. Binwahlan, N. Salim, and L. Suanmali, ‘Fuzzy Swarm Diversity Hybrid Model for Text Summarization,’ <i>Information Processing and Management</i>, Vol. 46, p. 578.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref5">[5]</a> M.S. Binwahlan, N. Salim, and L. Suanmali, ‘Fuzzy Swarm Diversity Hybrid Model for Text Summarization,’ <i>Information Processing and Management</i>, Vol. 46, p. 581.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref6">[6]</a> M.S. Binwahlan, N. Salim, and L. Suanmali, ‘Fuzzy Swarm Diversity Hybrid Model for Text Summarization,’ <i>Information Processing and Management</i>, Vol. 46, p. 583.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref7">[7]</a> R. M. Alguliev, R.M. Aliguliyev, M.S. Hajirahimova, and C.A. Mehdiyev, ‘MCMR: Maximum Coverage and Minimum Redundant Text Summarization Model,’ <i>Expert Systems With Applications</i>, Vol. 38, pp. 14514-14522.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref8">[8]</a> R. M. Alguliev, R.M. Aliguliyev, M.S. Hajirahimova, and C.A. Mehdiyev, ‘MCMR: Maximum Coverage and Minimum Redundant Text Summarization Model,’ <i>Expert Systems With Applications</i>, Vol. 38.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref9">[9]</a> R. M. Alguliev, R.M. Aliguliyev, M.S. Hajirahimova, and C.A. Mehdiyev, ‘MCMR: Maximum Coverage and Minimum Redundant Text Summarization Model,’ <i>Expert Systems With Applications</i>, Vol. 38, p. 14516.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref10">[10]</a> R. M. Alguliev, R.M. Aliguliyev, M.S. Hajirahimova, and C.A. Mehdiyev, ‘MCMR: Maximum Coverage and Minimum Redundant Text Summarization Model,’ <i>Expert Systems With Applications</i>, Vol. 38, p. 14518.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref11">[11]</a> M.S. Yang, W.L. Hung, and D.H. Chen, ‘Self-Organizing Map for Symbolic Data,’ <i>Fuzzy Sets and Systems</i>, Vol. 203.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref12">[12]</a> M.S. Yang, W.L. Hung, and D.H. Chen, ‘Self-Organizing Map for Symbolic Data,’ <i>Fuzzy Sets and Systems</i>, Vol. 203, p. 49.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref13">[13]</a> M.S. Yang, W.L. Hung, and D.H. Chen, ‘Self-Organizing Map for Symbolic Data,’ <i>Fuzzy Sets and Systems</i>, Vol. 203, pp. 51-2.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref14">[14]</a> M.S. Yang, W.L. Hung, and D.H. Chen, ‘Self-Organizing Map for Symbolic Data,’ <i>Fuzzy Sets and Systems</i>, Vol. 203, p. 72.</p>
</div>
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismchecker.net/plagiarism-detection/text-summarisation/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>3: Fuzzy Sets and Swarm-Based Comparisons</title>
		<link>http://www.plagiarismchecker.net/plagiarism-detection/fuzzy-sets-swarm-based-comparisons/</link>
		<comments>http://www.plagiarismchecker.net/plagiarism-detection/fuzzy-sets-swarm-based-comparisons/#comments</comments>
		<pubDate>Thu, 28 Mar 2013 21:43:57 +0000</pubDate>
		<dc:creator>Plagiarism Checker</dc:creator>
				<category><![CDATA[Plagiarism Detection Methods]]></category>

		<guid isPermaLink="false">http://www.plagiarismchecker.net/?p=2376</guid>
		<description><![CDATA[<a href="http://www.plagiarismchecker.net/plagiarism-detection/fuzzy-sets-swarm-based-comparisons/"><img align="left" hspace="5" width="150" height="150" src="http://www.plagiarismchecker.net/wp-content/plugins/thumbnail-for-excerpts/tfe_no_thumb.png" class="alignleft wp-post-image tfe" alt="" title="" /></a>3.1 Introduction The following chapter offers a review of the emergent research associated with fuzzy logic and swarm-based technologies.  An emergent field in linguistic and textual analysis and summarisation, these techniques are only just now being applied to problems of<span class="ellipsis">&#8230;</span><div class="read-more"><a href="http://www.plagiarismchecker.net/plagiarism-detection/fuzzy-sets-swarm-based-comparisons/">Read more &#8250;</a></div><!-- end of .read-more -->]]></description>
				<content:encoded><![CDATA[<h2>3.1 Introduction</h2>
<p>The following chapter offers a review of the emergent research associated with fuzzy logic and swarm-based technologies.  An emergent field in linguistic and textual analysis and summarisation, these techniques are only just now being applied to problems of plagiarism and corpus-based comparisons.  Given that plagiarism itself is becoming more complex, oftentimes involving translation-based subterfuge, there is a distinct need for methodologies which are able to reconcile the contextual, conceptual, and semantic dimensions of similarity within a comparative framework.  These studies can be viewed as a positive step forward towards a resolution of the quantitative and qualitative relationships within academic documents.</p>
<h2>3.2 Fuzzy Plagiarism Detection, Translation, Semantics, and Linguistic Variables</h2>
<p>Citing an evolution of plagiarism detection techniques, Kent and Salim recognise that whilst the majority of early-stage detection systems were based upon fingerprint matching, they have since been joined by evolved models utilising advanced clustering techniques or stylometry measurement and comparison.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn1">[1]</a>  Based upon translation modelling and the distillation of the source text through pre-processing techniques such as stop word removal and stemming, Kent and Salim propose that even translated texts can be traced back to their original source through more comprehensive detail comparison and taxonomy assessment.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn2">[2]</a>  Advancing this analytical resource towards a more effective, fuzzy swarm model, the researchers propose that particle swarm optimisation can be used to assess five key features across the intra-textual content including sentence centrality, key word feature, first sentence similarity, title feature, and word sentence score.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn3">[3]</a>  This swarm intelligence model is integrated into the fuzzy logic algorithm in order to develop a summarisation output.  Specifically, the explicit, ‘crisp’ numerical values obtained during the five-feature swarm assessment are then used as the input for the fuzzification process, resulting in a value output between 0 and 1, and a scalar interpretation of low, medium, and high correlation.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn4">[4]</a>  Given the complexity of in-sentence relationships, Kent and Salim propose the use of more than 200 If-Then fuzzy rules such as the following:<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn5">[5]</a></p>
<p>IF (WSS is L) and (SC is L) and (S_FD is M) and (SS_NG is L) and (KWRD is L) then (output is unimportant), whereby each acronym stands for one of the five scores in the swarm intelligence analysis.</p>
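<p>To make the rule above concrete, the following Python sketch fuzzifies five crisp scores and computes how strongly the rule fires.  It is an illustration under stated assumptions rather than the Kent and Salim system: the triangular membership shapes, their breakpoints, the use of the minimum for ‘and’, and the names <code>triangular</code>, <code>fuzzify</code>, and <code>rule_unimportant</code> are all hypothetical, and the acronyms are treated simply as opaque feature keys.</p>

```python
def triangular(x, left, peak, right):
    """Triangular membership function returning a degree in [0, 1]."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def fuzzify(score):
    """Map a crisp score in [0, 1] to degrees of Low, Medium, and High."""
    return {
        "L": triangular(score, -0.5, 0.0, 0.5),   # assumed breakpoints
        "M": triangular(score, 0.0, 0.5, 1.0),
        "H": triangular(score, 0.5, 1.0, 1.5),
    }


def rule_unimportant(scores):
    """Firing strength of the example rule, taking 'and' as the minimum."""
    wanted = {"WSS": "L", "SC": "L", "S_FD": "M", "SS_NG": "L", "KWRD": "L"}
    return min(fuzzify(scores[name])[label] for name, label in wanted.items())


# Hypothetical crisp scores for one sentence.
sentence_scores = {"WSS": 0.1, "SC": 0.2, "S_FD": 0.5, "SS_NG": 0.0, "KWRD": 0.1}
strength = rule_unimportant(sentence_scores)
```

<p>In a full inference system, the strengths of all rules would be aggregated and defuzzified into a single sentence score; this sketch stops at the firing strength of one rule.</p>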
<p>Once these rules have been applied to the source text, each sentence is given a score that is based upon the fuzzy inference system and a final summary is obtained that is ‘based on the top-n highest score sentences where n is the compression rates of the documents which are determined by the users’.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn6">[6]</a>  The most important contribution of the Kent and Salim research, however, is not necessarily the design of the fuzzy model; instead, it is the pursuit of plagiarism detection for both translated and semantically similar texts.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn7">[7]</a>  Specifically, a similarity index between the predicates in the documents is calculated, whereby a synthesis of all possible predicate combinations from each sentence is generated and then compared across the corpus of potential documents.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn8">[8]</a>  The output is then based upon the detection of similarity between the object and subject in the sentences through semantic comparison, allowing analysts to identify plagiarism regardless of structural manipulations.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn9">[9]</a></p>
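<p>The top-n selection quoted above can be sketched in a few lines of Python.  The sketch assumes that the compression rate is a fraction of the document’s sentences and that the chosen sentences are returned in their original order; the function name <code>top_n_summary</code> and both of these choices are assumptions, since the source does not specify them.</p>

```python
import math


def top_n_summary(sentences, scores, compression_rate):
    """Keep the highest-scoring sentences, returned in document order."""
    # Assumption: n is a fraction of the sentence count, and at least one.
    n = max(1, math.ceil(len(sentences) * compression_rate))
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    keep = sorted(ranked[:n])
    return [sentences[i] for i in keep]


summary = top_n_summary(
    ["first", "second", "third", "fourth"],
    [0.2, 0.9, 0.1, 0.7],
    compression_rate=0.5,
)
```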
<p>Within the concept of fuzzy (linguistic) data analysis, Kaburlasos et al. propose that a graph matching protocol using neural networks can be used in object recognition in order to reflect the similarity or best match output between an input ‘graph’ and a stored ‘graph’.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn10">[10]</a>  Although the researchers apply this analytical protocol to various hypothetical ‘graphing’ inputs, the relevance of the fuzzy lattice function for identifying similarity and consistency between textual outputs is significant.  Specifically, the researchers define a fuzzy set as a paired function which includes <i>U</i> as a universe of discourse and <i>m</i> as a membership function, allowing for the assessment of similarity space and equivalence relation output.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn11">[11]</a>  Extending such techniques to the field of linguistic processing, Alguliev et al. propose that similarity measures operate at the core of natural language processing, whereby the similarity between texts can be classified into four primary categories including word co-occurrence/vector-based methods, corpus-based methods, hybrid methods, and descriptive-feature-based methods.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn12">[12]</a>  Given that each document in question is likely to reflect a diverse spectrum of information which may or may not be manipulated towards a masking purpose, Alguliev et al. propose that effective summarisation methods are able to extract core evidence and compare these outputs with other similar, target documents in order to highlight redundancies and similarity-based relationships.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn13">[13]</a></p>
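<p>Of the four categories of similarity measure listed above, the word co-occurrence/vector-based family is the simplest to illustrate.  The Python sketch below computes the cosine similarity between the word-count vectors of two texts.  It is a generic example and not a method taken from the cited authors; the function name <code>cosine_similarity</code> and the whitespace tokenisation are assumptions.</p>

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the word-count vectors of two texts."""
    counts_a = Counter(text_a.lower().split())
    counts_b = Counter(text_b.lower().split())
    # A Counter returns 0 for words it has not seen, so only shared words
    # contribute to the dot product.
    dot = sum(counts_a[word] * counts_b[word] for word in counts_a)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


score = cosine_similarity("the student copied the text",
                          "the text was copied by the student")
```

<p>A measure of this kind ignores word order and synonymy, which is one reason the corpus-based and hybrid categories exist alongside it.</p>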
<div>
<hr align="left" size="1" width="33%" />
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref1">[1]</a> C.K. Kent and N. Salim, ‘Web Based Cross Language Plagiarism Detection,’ <i>IEEE ICCIMS</i>, Vol. 2, pp. 199-204.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref2">[2]</a> C.K. Kent and N. Salim, ‘Web Based Cross Language Plagiarism Detection,’ <i>IEEE ICCIMS</i>, Vol. 2.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref3">[3]</a> C.K. Kent and N. Salim, ‘Web Based Cross Language Semantic Plagiarism Detection,’ <i>IEEE ICDASC</i>, Vol. 9, p. 1098.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref4">[4]</a> C.K. Kent and N. Salim, ‘Web Based Cross Language Semantic Plagiarism Detection,’ <i>IEEE ICDASC</i>, Vol. 9, p. 1098.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref5">[5]</a> C.K. Kent and N. Salim, ‘Web Based Cross Language Semantic Plagiarism Detection,’ <i>IEEE ICDASC</i>, Vol. 9, p. 1098.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref6">[6]</a> C.K. Kent and N. Salim, ‘Web Based Cross Language Semantic Plagiarism Detection,’ <i>IEEE ICDASC</i>, Vol. 9, p. 1098.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref7">[7]</a> C.K. Kent and N. Salim, ‘Web Based Cross Language Semantic Plagiarism Detection,’ <i>IEEE ICDASC</i>, Vol. 9, p. 1100.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref8">[8]</a> C.K. Kent and N. Salim, ‘Web Based Cross Language Semantic Plagiarism Detection,’ <i>IEEE ICDASC</i>, Vol. 9, p. 1101.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref9">[9]</a> C.K. Kent and N. Salim, ‘Web Based Cross Language Semantic Plagiarism Detection,’ <i>IEEE ICDASC</i>, Vol. 9, p. 1101.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref10">[10]</a> V.G. Kaburlasos, L. Moussiades, and A. Vakali, ‘Fuzzy Lattice Reasoning (FLR) Type Neural Computation for Weighted Graph Partitioning,’ <i>Neurocomputing</i>, Vol. 72.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref11">[11]</a> V.G. Kaburlasos, L. Moussiades, and A. Vakali, ‘Fuzzy Lattice Reasoning (FLR) Type Neural Computation for Weighted Graph Partitioning,’ <i>Neurocomputing</i>, Vol. 72, p. 2124.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref12">[12]</a> R.M. Alguliev, R.M. Alguliyev, and C.A. Mehdiyev, ‘Sentence Selection for Generic Document Summarization Using an Adaptive Differential Evolution Algorithm,’ <i>Swarm and Evolutionary Computation</i>, Vol. 1, p. 215.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref13">[13]</a> R.M. Alguliev, R.M. Alguliyev, and C.A. Mehdiyev, ‘Sentence Selection for Generic Document Summarization Using an Adaptive Differential Evolution Algorithm,’ <i>Swarm and Evolutionary Computation</i>, Vol. 1, p. 221.</p>
</div>
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismchecker.net/plagiarism-detection/fuzzy-sets-swarm-based-comparisons/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>2.10: From Plagiarism Detection to Socio-Cultural Stereotyping, the New Debate</title>
		<link>http://www.plagiarismchecker.net/plagiarism-detection/plagiarism-detection-socio/</link>
		<comments>http://www.plagiarismchecker.net/plagiarism-detection/plagiarism-detection-socio/#comments</comments>
		<pubDate>Thu, 28 Mar 2013 21:42:34 +0000</pubDate>
		<dc:creator>Plagiarism Checker</dc:creator>
				<category><![CDATA[Plagiarism Detection Methods]]></category>

		<guid isPermaLink="false">http://www.plagiarismchecker.net/?p=2374</guid>
		<description><![CDATA[<a href="http://www.plagiarismchecker.net/plagiarism-detection/plagiarism-detection-socio/"><img align="left" hspace="5" width="150" height="150" src="http://www.plagiarismchecker.net/wp-content/plugins/thumbnail-for-excerpts/tfe_no_thumb.png" class="alignleft wp-post-image tfe" alt="" title="" /></a>Whilst detecting multilingual plagiarism may represent a new dimension in assessment and analysis, there is a robust, emergent debate today in academia regarding cultural stereotyping and the prevalence of plagiarism amongst ESL and overseas students.  Sowden, for example, began with<span class="ellipsis">&#8230;</span><div class="read-more"><a href="http://www.plagiarismchecker.net/plagiarism-detection/plagiarism-detection-socio/">Read more &#8250;</a></div><!-- end of .read-more -->]]></description>
				<content:encoded><![CDATA[<p>
Whilst detecting multilingual plagiarism may represent a new dimension in assessment and analysis, there is a robust, emergent debate today in academia regarding cultural stereotyping and the prevalence of plagiarism amongst ESL and overseas students.  Sowden, for example, began with the cultural underpinnings of plagiarism and the concept of ‘communal ownership’, a potentially problematic dimension of socio-cultural diversity.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn1">[1]</a>  Citing evidence from students in various cultures, Sowden identified incidences of plagiarism in which students found their behaviour inherently appropriate considering the tribute which it paid to their instructors, tutors, and sources.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn2">[2]</a>   Yet in spite of a range of examples, Sowden also cautions against stereotyping multilingual students, suggesting that instead, students should be universally encouraged to meet a similar standard, pursuing originality, yet anticipating conceptual mirroring and similarity by virtue of the topics and subjects that are being researched.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn3">[3]</a></p>
<p>Other researchers take a less amenable position than Sowden in relation to the nature of stereotyping and educational concerns regarding foreign or EFL students.  For example, Ha reflects that whilst many Western instructors are quick to stereotype foreign students as inherently poor scholars or poor writers, the degree of commitment to writing education across their multiple primary institutions has remained extremely variable.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn4">[4]</a>  Further, Ha suggests that many institutions fail to take the time to consider the influence which culture, language, and identity have on student writing, attempting to prescribe more localised cultural values instead of addressing the root of the problem: variability in education and cultural development.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn5">[5]</a>  Citing the Sowden and Ha positions, Liu counters that to ascribe an inherently ‘culturally conditioned’ stereotype to any foreign student is to fail to address the pedagogical implications of plagiarism and to constrain any actual solution-oriented systems.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn6">[6]</a>  From language and writing development to in-classroom guidance and support, Liu argues that rather than ‘dwelling on issues that have few direct pedagogical implications’, it is imperative to focus on appropriate solutions that are oriented towards overcoming the problem, rather than excusing it.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn7">[7]</a></p>
<div>
<hr align="left" size="1" width="33%" />
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref1">[1]</a> C. Sowden, ‘Plagiarism and the Culture of Multilingual Students in Higher Education Abroad,’ <i>ELT Journal</i>, (2005), p. 226.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref2">[2]</a> C. Sowden, ‘Plagiarism and the Culture of Multilingual Students in Higher Education Abroad,’ <i>ELT Journal</i>, (2005), p. 228.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref3">[3]</a> C. Sowden, ‘Plagiarism and the Culture of Multilingual Students in Higher Education Abroad,’ <i>ELT Journal</i>, (2005), pp. 228-30.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref4">[4]</a> P.L. Ha, ‘Plagiarism and Overseas Students: Stereotypes Again?’ <i>ELT Journal</i>, (2006), pp. 76-78.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref5">[5]</a> P.L. Ha, ‘Plagiarism and Overseas Students: Stereotypes Again?’ <i>ELT Journal</i>, (2006), p. 78.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref6">[6]</a> D. Liu, ‘Plagiarism in ESOL Students: Is Cultural Conditioning Truly the Major Culprit?’ <i>ELT Journal</i>, (2005), pp. 234-241.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref7">[7]</a> D. Liu, ‘Plagiarism in ESOL Students: Is Cultural Conditioning Truly the Major Culprit?’ <i>ELT Journal</i>, (2005), p. 241.</p>
</div>
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismchecker.net/plagiarism-detection/plagiarism-detection-socio/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>2.9: Cross Language/Multi-Lingual Plagiarism: The New Obfuscation</title>
		<link>http://www.plagiarismchecker.net/plagiarism-detection/cross-language-plagiarism/</link>
		<comments>http://www.plagiarismchecker.net/plagiarism-detection/cross-language-plagiarism/#comments</comments>
		<pubDate>Thu, 28 Mar 2013 21:41:28 +0000</pubDate>
		<dc:creator>Plagiarism Checker</dc:creator>
				<category><![CDATA[Plagiarism Detection Methods]]></category>

		<guid isPermaLink="false">http://www.plagiarismchecker.net/?p=2372</guid>
		<description><![CDATA[<a href="http://www.plagiarismchecker.net/plagiarism-detection/cross-language-plagiarism/"><img align="left" hspace="5" width="150" height="150" src="http://www.plagiarismchecker.net/wp-content/plugins/thumbnail-for-excerpts/tfe_no_thumb.png" class="alignleft wp-post-image tfe" alt="" title="" /></a>Recognised as cross-language or translated plagiarism, Alzahrani et al. define this process as the translating and manipulating of a ‘natural language text from one language into another without proper referencing to the original source’.[1]  Based upon a summative, keyword search<span class="ellipsis">&#8230;</span><div class="read-more"><a href="http://www.plagiarismchecker.net/plagiarism-detection/cross-language-plagiarism/">Read more &#8250;</a></div><!-- end of .read-more -->]]></description>
				<content:encoded><![CDATA[<p>Alzahrani et al. define cross-language or translated plagiarism as the translating and manipulating of a ‘natural language text from one language into another without proper referencing to the original source’.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn1">[1]</a>  Based upon a summative, keyword search process, the evolution of a fuzzy, swarm-based platform for analysing academic sources across multiple languages is a marked step forward in this field.  The underlying strategies associated with the Alzahrani et al. system design include the summary of the text in question, the identification of native language keywords, a system crawl of resources in other languages with similar keywords, and finally, a detailed dictionary-based analysis of the original text versus the various search results.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn2">[2]</a>  This particular approach makes a unique contribution to the field of plagiarism scanning in the form of summarisation modelling of native and foreign textual vocabulary.  Further, the fuzzy swarm-based analytical technique is an extremely forward-thinking methodology for analysing and identifying textual patterns and keywords in order to reduce the bandwidth and scope of intra-textual searching and review.</p>
<p>In order to develop this pioneering method, Alzahrani et al. identified five key sentence features that are used to score the various sentences throughout the native language text: sentence centrality, title feature, word sentence score, top word feature, and first sentence similarity.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn3">[3]</a>  One of the key tools required for the translation approach to the search-based analysis of foreign language texts is the Google AJAX Language API.  As of April 20<sup>th</sup>, 2012, Google had redefined this translation protocol as the Google Translate API and offered a paid service whose cost varies according to the millions of characters that are translated.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn4">[4]</a>  One of the key values of the Google Translate service is that it is accessible in multiple programming languages ranging from Ruby to Java to Python, allowing software developers to adapt a third-party translation protocol that can be integrated into multiple online platforms.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn5">[5]</a>  Once translated summaries have been compared across the breadth of foreign language resources featuring a keyword similarity outcome, the Alzahrani et al. model provides users with a similarity record that allows for threshold outcomes to reflect the likelihood of plagiarism and enables further in-depth document assessment by third-party analysts (educators, adjudicators, etc.).<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn6">[6]</a></p>
<p>Reflecting upon the multidimensionality of plagiarism, particularly in the context of English (L2) scholars, Pecorari defines the concept of ‘patchwriting’, which represents a holistic union between writer and source that is the direct result of a lack of competence and linguistic proficiency, rather than a purposeful incident of plagiarism.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn7">[7]</a>  In fact, Stapleton suggests that there may be a direct bias in the frequency of plagiarism towards L2 students (when compared with English native speakers) due to both linguistic and attitudinal forces.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn8">[8]</a>  Importantly, it is the relative bias of the output scoring mechanisms (e.g. the originality report for Turnitin) which Stapleton suggests may falsely identify correlations in textual constructions, particularly when assessing non-native English speakers.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn9">[9]</a>  On the other hand, there is evidence to suggest that the presence of a formal checking mechanism may reduce the frequency of plagiarism, particularly when applied with concurrent explanatory supports from the classroom instructor.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn10">[10]</a>  In spite of the important questions raised in the Stapleton study, it is evident that generalisations and assumptions in plagiarism monitoring are continuing to perpetuate inconsistencies, limiting the potential for more prescriptive solutions.</p>
<p>Given that the principle of obfuscation in multilingual plagiarism is based upon the manipulation of multiple linguistic dimensions (e.g. grammatical, semantic, structural), Ceska et al. propose that more universal mechanisms can be developed in order to limit the likelihood of scanner oversights.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn11">[11]</a>  Specifically, the researchers developed a multilingual database for multiple European languages that compares synsets (sets of synonyms) and interconnects languages through an inter-lingual index.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn12">[12]</a>  Classifying their approach as a dictionary lemmatisation method, the researchers propose that by developing a sufficient database of corresponding lemmas within the EuroWordNet thesaurus, pre-processing can significantly improve the capacity for cross-lingual plagiarism detection.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn13">[13]</a>  The empirical tests represent a unique breadth of multilingual assessment, whereby the linguistic factor is constrained in its potential for obscuring the detectability of translation-based copying; however, the research itself is indicative of a clear challenge in such model development: the complexity and depth of the database required for adequate assessment.  In fact, the Ceska et al. model is just one step forward in such processing technologies: it falls short of a sufficiently populated global thesaurus, whilst simultaneously demonstrating the viability of such techniques in multilingual assessment.</p>
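<p>The idea of interconnecting languages through an inter-lingual index can be sketched as follows. The miniature index, the concept identifiers, and the choice of English and Czech are purely illustrative assumptions; a working system would draw its mappings from a resource such as EuroWordNet, and the Jaccard overlap used here is a stand-in for whatever comparison Ceska et al. actually apply.</p>

```python
# Hypothetical miniature inter-lingual index: (language, lemma) -> concept id.
# These entries are invented for illustration and are not taken from EuroWordNet.
TOY_INDEX = {
    ("en", "dog"): "c1", ("en", "hound"): "c1", ("cs", "pes"): "c1",
    ("en", "house"): "c2", ("en", "home"): "c2", ("cs", "dům"): "c2",
    ("en", "run"): "c3", ("cs", "běžet"): "c3",
}

def to_concepts(lemmas, language, index=TOY_INDEX):
    """Map lemmas to language-independent concept ids, skipping unknown words."""
    return {index[(language, w)] for w in lemmas if (language, w) in index}

def concept_overlap(lemmas_a, lang_a, lemmas_b, lang_b, index=TOY_INDEX):
    """Jaccard overlap of the concept sets of two lemmatised texts."""
    a = to_concepts(lemmas_a, lang_a, index)
    b = to_concepts(lemmas_b, lang_b, index)
    union = a.union(b)
    return len(a.intersection(b)) / len(union) if union else 0.0
```

<p>Because synonyms and translations collapse to the same concept identifier, a text that has been both translated and reworded with synonyms can still overlap fully with its source. The sketch also makes the limitation noted above visible: any lemma missing from the index is silently ignored, so coverage of the thesaurus bounds what can be detected.</p>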
<div>
<hr align="left" size="1" width="33%" />
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref1">[1]</a> S. Alzahrani, M.S. Binwahlan, N. Salim, L. Sunmali, and C.K. Kent, ‘The Development of Cross-Language Plagiarism Detection Tool Utilising Fuzzy Swarm-Based Summarisation.’ (2010) <i>IEEE: 10<sup>th</sup> International Conference on Intelligent Systems Design and Applications</i>, pp. 86-90.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref2">[2]</a> S. Alzahrani, M.S. Binwahlan, N. Salim, L. Sunmali, and C.K. Kent, ‘The Development of Cross-Language Plagiarism Detection Tool Utilising Fuzzy Swarm-Based Summarisation.’ (2010) <i>IEEE: 10<sup>th</sup> International Conference on Intelligent Systems Design and Applications</i>, p. 86.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref3">[3]</a> S. Alzahrani, M.S. Binwahlan, N. Salim, L. Sunmali, and C.K. Kent, ‘The Development of Cross-Language Plagiarism Detection Tool Utilising Fuzzy Swarm-Based Summarisation.’ (2010) <i>IEEE: 10<sup>th</sup> International Conference on Intelligent Systems Design and Applications</i>, p. 88.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref4">[4]</a> Google, ‘Google Translate API,’ (2012), Online Resource.  Accessed on 10<sup>th</sup> November from: <a href="https://developers.google.com/translate/">https://developers.google.com/translate/</a>.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref5">[5]</a> Google, ‘Google Translate API,’ (2012), Online Resource.  Accessed on 10<sup>th</sup> November from: <a href="https://developers.google.com/translate/">https://developers.google.com/translate/</a>.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref6">[6]</a> S. Alzahrani, M.S. Binwahlan, N. Salim, L. Sunmali, and C.K. Kent, ‘The Development of Cross-Language Plagiarism Detection Tool Utilising Fuzzy Swarm-Based Summarisation.’ (2010) <i>IEEE: 10<sup>th</sup> International Conference on Intelligent Systems Design and Applications</i>, p. 89.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref7">[7]</a> D. Pecorari, ‘Good and Original: Plagiarism and Patchwriting in Academic Second-Language Writing,’ (2003) <i>Journal of Second Language Writing</i>, Vol. 12, pp. 317-345.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref8">[8]</a> P. Stapleton, ‘Gauging the Effectiveness of Anti-Plagiarism Software: An Empirical Study of Second Language Graduate Writers,’ (2012) <i>Journal of English for Academic Purposes</i>, Vol. 11, pp. 125-133.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref9">[9]</a> P. Stapleton, ‘Gauging the Effectiveness of Anti-Plagiarism Software: An Empirical Study of Second Language Graduate Writers,’ (2012) <i>Journal of English for Academic Purposes</i>, Vol. 11, p. 130.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref10">[10]</a> P. Stapleton, ‘Gauging the Effectiveness of Anti-Plagiarism Software: An Empirical Study of Second Language Graduate Writers,’ (2012) <i>Journal of English for Academic Purposes</i>, Vol. 11, p. 132.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref11">[11]</a> Z. Ceska, M. Toma, and K. Jezek, ‘Multilingual Plagiarism Detection,’ In: D. Dochev, M. Pistore, and P. Traverso (Eds) <i>AMISA</i> (2008), pp. 83-92.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref12">[12]</a> Z. Ceska, M. Toma, and K. Jezek, ‘Multilingual Plagiarism Detection,’ In: D. Dochev, M. Pistore, and P. Traverso (Eds) <i>AMISA</i> (2008), p. 84.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref13">[13]</a> Z. Ceska, M. Toma, and K. Jezek, ‘Multilingual Plagiarism Detection,’ In: D. Dochev, M. Pistore, and P. Traverso (Eds) <i>AMISA</i> (2008), p. 85.</p>
</div>
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismchecker.net/plagiarism-detection/cross-language-plagiarism/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>2.7 &#8211; 2.8: Fingerprinting, Game Series and Automation</title>
		<link>http://www.plagiarismchecker.net/plagiarism-detection/fingerprinting-game-series/</link>
		<comments>http://www.plagiarismchecker.net/plagiarism-detection/fingerprinting-game-series/#comments</comments>
		<pubDate>Thu, 28 Mar 2013 21:40:10 +0000</pubDate>
		<dc:creator>Plagiarism Checker</dc:creator>
				<category><![CDATA[Plagiarism Detection Methods]]></category>

		<guid isPermaLink="false">http://www.plagiarismchecker.net/?p=2370</guid>
		<description><![CDATA[<a href="http://www.plagiarismchecker.net/plagiarism-detection/fingerprinting-game-series/"><img align="left" hspace="5" width="150" height="150" src="http://www.plagiarismchecker.net/wp-content/plugins/thumbnail-for-excerpts/tfe_no_thumb.png" class="alignleft wp-post-image tfe" alt="" title="" /></a>2.7 Fingerprinting and Student Identification Whilst many of the incidences of plagiarism identified in academic institutions are tied to an extra corpal[1]/cohort[2] outcome, the intra-corpal/cohort outcome (e.g. copying from a classmate’s essay) presents a unique concern for educators as classroom<span class="ellipsis">&#8230;</span><div class="read-more"><a href="http://www.plagiarismchecker.net/plagiarism-detection/fingerprinting-game-series/">Read more &#8250;</a></div><!-- end of .read-more -->]]></description>
				<content:encoded><![CDATA[<h2>2.7 Fingerprinting and Student Identification</h2>
<p>Whilst many of the incidences of plagiarism identified in academic institutions are tied to an extra-corpal<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn1">[1]</a>/cohort<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn2">[2]</a> outcome, the intra-corpal/cohort outcome (e.g. copying from a classmate’s essay) presents a unique concern for educators as classroom sizes and institutional flexibility (e.g. online courses) continue to grow.  The fingerprinting, document-tagging approach proposed by Weir et al. attempts to eliminate the potential for intra-textual exchange within classroom settings by prescribing explicit document tags for each student.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn3">[3]</a>  This protocol, although potentially vulnerable (e.g. hacking, document manipulation, backdoor access, etc.), serves as a front-line control mechanism for student activities.  Importantly, Weir et al. propose that if any part of a student’s document is shared with their classmates, those underlying tags will also transfer, providing an immediate flag to indicate plagiarism as instructors begin their reviewing process.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn4">[4]</a>  The primary limitation of such a text-to-text control structure is that it fails to address the more likely outcome of any student sharing practices in higher education institutions: a summative, manipulative textual adoption that bypasses copying and word-for-word replication.</p>
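<p>The tagging principle can be illustrated with a minimal sketch. The details below are assumptions made for illustration and are not drawn from Weir et al.: each tag is a keyed hash of a student identifier, tags are written as visible <i>[[tag:...]]</i> markers at the end of each paragraph (a real scheme would presumably hide them in the document format), and all function names are hypothetical.</p>

```python
import hashlib
import hmac
import re

# Visible marker format chosen purely for illustration.
TAG_PATTERN = re.compile(r"\[\[tag:([0-9a-f]{12})\]\]")

def student_tag(student_id, secret):
    """Derive a short keyed tag so students cannot forge each other's markers."""
    digest = hmac.new(secret, student_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:12]

def tag_paragraphs(text, student_id, secret):
    """Append the student's tag to every paragraph of a document."""
    marker = "[[tag:" + student_tag(student_id, secret) + "]]"
    return "\n\n".join(p + " " + marker for p in text.split("\n\n"))

def foreign_tags(submission, student_id, secret):
    """Return tags found in a submission that do not belong to its author."""
    own = student_tag(student_id, secret)
    return sorted({t for t in TAG_PATTERN.findall(submission) if t != own})
```

<p>A non-empty result from <i>foreign_tags</i> would correspond to the immediate flag described above. The sketch also shows the limitation: a student who retypes or paraphrases a classmate’s paragraph carries no tag across, so nothing is flagged.</p>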
<h2>2.8 Game Series and Automated Plagiarism Detection</h2>
<p>In order to facilitate the learning objectives of any degree level course, a range of assignments or examinations is prescribed via syllabus for the student population.  Recognising the opportunities afforded in today’s game-oriented, online-experienced society, Graven and MacKinnon reflect that there is significant potential to fundamentally revise the classwork schedule in order not only to maximise the utility of the education process, but also to mitigate the potential for student plagiarism.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn5">[5]</a>  This particular strategy involves the use of an automated, in-system scanning tool which incorporates the Turnitin module as the primary mode of comparison.  On the basis of empirical findings, Graven and MacKinnon argue that for more unsophisticated plagiarism, the system itself is a valuable addition to electronic learning; however, due to inadequate richness and fuzziness, it becomes much more difficult to detect robust, content-based response manipulation.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn6">[6]</a></p>
<div>
<hr align="left" size="1" width="33%" />
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref1">[1]</a> K. Dey and M.A. Sobhan, ‘Impact of Unethical Practices of Plagiarism on Learning, Teaching and Research in Higher Education: Some Combating Strategies,’ <i>ITHET</i>, (2006) p. 2.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref2">[2]</a> G.R.S. Weir, M.A. Gordon, and G. MacGregor, ‘Work in Progress - Technology in Plagiarism Detection and Management,’ (2004), <i>IEEE Frontiers in Education Conference</i>, p. 18.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref3">[3]</a> G.R.S. Weir, M.A. Gordon, and G. MacGregor, ‘Work in Progress - Technology in Plagiarism Detection and Management,’ (2004), <i>IEEE Frontiers in Education Conference</i>, p. 19.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref4">[4]</a> G.R.S. Weir, M.A. Gordon, and G. MacGregor, ‘Work in Progress - Technology in Plagiarism Detection and Management,’ (2004), <i>IEEE Frontiers in Education Conference</i>, p. 19.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref5">[5]</a> O.H. Graven and L.M. MacKinnon, ‘A Consideration of the Use of Plagiarism Tools for Automated Student Assessment’, <i>IEEE Transactions on Education</i>, Vol. 51, No. 2, pp. 212-220.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref6">[6]</a> O.H. Graven and L.M. MacKinnon, ‘A Consideration of the Use of Plagiarism Tools for Automated Student Assessment’, <i>IEEE Transactions on Education</i>, Vol. 51, No. 2, p. 219.</p>
</div>
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismchecker.net/plagiarism-detection/fingerprinting-game-series/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>2.6 Fuzzy Models and Language Processing</title>
		<link>http://www.plagiarismchecker.net/plagiarism-detection/fuzzy-models-and-language-processing/</link>
		<comments>http://www.plagiarismchecker.net/plagiarism-detection/fuzzy-models-and-language-processing/#comments</comments>
		<pubDate>Thu, 28 Mar 2013 21:38:23 +0000</pubDate>
		<dc:creator>Plagiarism Checker</dc:creator>
				<category><![CDATA[Plagiarism Detection Methods]]></category>

		<guid isPermaLink="false">http://www.plagiarismchecker.net/?p=2368</guid>
		<description><![CDATA[<a href="http://www.plagiarismchecker.net/plagiarism-detection/fuzzy-models-and-language-processing/"><img align="left" hspace="5" width="150" height="150" src="http://www.plagiarismchecker.net/wp-content/plugins/thumbnail-for-excerpts/tfe_no_thumb.png" class="alignleft wp-post-image tfe" alt="" title="" /></a>Recognising that paraphrasing or internal textual manipulation is continuing to avoid detection due to the inconsistent nature of the document fingerprints, Osman et al. suggest that a more immersive technique is needed based upon fuzzy logic and semantic role labelling<span class="ellipsis">&#8230;</span><div class="read-more"><a href="http://www.plagiarismchecker.net/plagiarism-detection/fuzzy-models-and-language-processing/">Read more &#8250;</a></div><!-- end of .read-more -->]]></description>
				<content:encoded><![CDATA[<p>Recognising that paraphrasing or internal textual manipulation is continuing to avoid detection due to the inconsistent nature of the document fingerprints, Osman et al. suggest that a more immersive technique is needed based upon fuzzy logic and semantic role labelling (SRL).<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn1">[1]</a>  As described, SRL involves the division of text into similar segments according to sentences, words, or topics, the deletion of meaningless words, and the introduction of a stemming algorithm to strip affixes and recover the root word.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn2">[2]</a>  Once the SRL framing is complete, Osman et al. apply an if-then rule which uses an ‘and’ operator to constrain the scope of the relationships, consolidating all possible rules into a single expression<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn3">[3]</a>:</p>
<ul>
<li>IF (Similarity score of argument x in Sentence 1 is Important) and (Similarity score of argument x in Sentence 2 is Important) and (Similarity score of argument x in Sentence 3 is Important) and (Similarity score of argument x in Sentence 4 is Important) and (Similarity score of argument x in Sentence 5 is Important) THEN (argument x is Important).</li>
</ul>
<p>Testing these fuzzy rules and SRL decoding on over 1,000 documents, the researchers reported that, on the standard PAN-PC-09 dataset for plagiarism detection, this modified fuzzy rule-set significantly outperformed less comprehensive semantic-based string similarity, graph-based, and SRL-argument weight methods.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn4">[4]</a>  In spite of such findings, it is evident that the Osman et al. methodology is simply an interwoven manipulation of an evolving foundation for semantic plagiarism detection which incorporates SRL and weighting protocols in order to effectively distinguish between meaningful and non-meaningful textual manipulations.</p>
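<p>The conjunctive if-then rule attributed to Osman et al. above can be sketched in a few lines. Two assumptions are made here that the source does not confirm: the fuzzy ‘and’ is taken to be the minimum of the membership degrees (the most common choice in fuzzy logic), and membership in ‘Important’ is modelled as a linear ramp between two illustrative cut-offs. The function names and thresholds are hypothetical.</p>

```python
def important(score, low=0.3, high=0.7):
    """Degree (0.0 to 1.0) to which a similarity score counts as 'Important'.

    A linear ramp between two cut-offs; both thresholds are illustrative assumptions.
    """
    return min(1.0, max(0.0, (score - low) / (high - low)))

def argument_importance(scores_per_sentence):
    """Fire the conjunctive rule, taking the fuzzy 'and' as the minimum membership."""
    if not scores_per_sentence:
        raise ValueError("at least one sentence score is required")
    return min(important(s) for s in scores_per_sentence)
```

<p>Under the minimum operator a single unimportant sentence score is enough to suppress the rule, which matches the constraining role that the ‘and’ operator plays in the rule as quoted.</p>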
<p>Interconnectivity, semantic similarities, and epistemological mirroring are characteristics of intrinsic plagiarism which continue to prove difficult for traditional platforms to detect.  Foudeh and Salim contribute a new ‘probabilistic ontology’ method which combines traditional (database-driven) and experimental (reasoning and quantitative probability analysis) techniques.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn5">[5]</a>  The experimental reasoning engine employed in this model is problematic, however, and encounters relational challenges similar to those highlighted in the Osman et al. model.  In fact, Foudeh and Salim, unable to adequately reconcile the challenges of criteria setting and similarity rule making, argue that a more comprehensive training set is needed in order to define accurate, consistent probabilistic threshold values.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn6">[6]</a>  From more intuitive analytical factors such as reference-based linking analysis to epistemological reasoning assessment and comparison, these probabilistic models are designed to evolve over time, maximising their capacity for similarity analysis and plagiarism determination.</p>
<div>
<hr align="left" size="1" width="33%" />
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref1">[1]</a> A.H. Osman, N. Salim, Y.J. Kumar, and A. Abuobieda, ‘Fuzzy Semantic Plagiarism Detection,’ In: E. Hassanien et al. (Eds) <i>Advanced Machine Learning Technologies and Applications</i>, Vol. 332, pp. 543-553.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref2">[2]</a> A.H. Osman, N. Salim, Y.J. Kumar, and A. Abuobieda, ‘Fuzzy Semantic Plagiarism Detection,’ In: E. Hassanien et al. (Eds) <i>Advanced Machine Learning Technologies and Applications</i>, Vol. 332, p. 546.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref3">[3]</a> A.H. Osman, N. Salim, Y.J. Kumar, and A. Abuobieda, ‘Fuzzy Semantic Plagiarism Detection,’ In: E. Hassanien et al. (Eds) <i>Advanced Machine Learning Technologies and Applications</i>, Vol. 332, p. 547.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref4">[4]</a> A.H. Osman, N. Salim, Y.J. Kumar, and A. Abuobieda, ‘Fuzzy Semantic Plagiarism Detection,’ In: E. Hassanien et al. (Eds) <i>Advanced Machine Learning Technologies and Applications</i>, Vol. 332, p. 552.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref5">[5]</a> P. Foudeh and N. Salim. ‘A Holistic Approach to Duplicate Publication and Plagiarism Detection Using Probabilistic Ontologies.’ In: E. Hassanien et al. (Eds) <i>Advanced Machine Learning Technologies and Applications</i>, Vol. 332, pp. 566-574.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref6">[6]</a> P. Foudeh and N. Salim. ‘A Holistic Approach to Duplicate Publication and Plagiarism Detection Using Probabilistic Ontologies.’ In: E. Hassanien et al. (Eds) <i>Advanced Machine Learning Technologies and Applications</i>, Vol. 332, p. 573.</p>
</div>
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismchecker.net/plagiarism-detection/fuzzy-models-and-language-processing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>2.5 Emergent External Plagiarism Detection Models</title>
		<link>http://www.plagiarismchecker.net/plagiarism-detection/emergent-external-plagiarism-detection/</link>
		<comments>http://www.plagiarismchecker.net/plagiarism-detection/emergent-external-plagiarism-detection/#comments</comments>
		<pubDate>Thu, 28 Mar 2013 21:37:38 +0000</pubDate>
		<dc:creator>Plagiarism Checker</dc:creator>
				<category><![CDATA[Plagiarism Detection Methods]]></category>

		<guid isPermaLink="false">http://www.plagiarismchecker.net/?p=2366</guid>
		<description><![CDATA[<a href="http://www.plagiarismchecker.net/plagiarism-detection/emergent-external-plagiarism-detection/"><img align="left" hspace="5" width="150" height="150" src="http://www.plagiarismchecker.net/wp-content/plugins/thumbnail-for-excerpts/tfe_no_thumb.png" class="alignleft wp-post-image tfe" alt="" title="" /></a>One of the most significant challenges in external plagiarism detection is the breadth of the corpus data that is set as the input for analytical systems.  Micol et al. reflect that in order to reconcile the scope of information management<span class="ellipsis">&#8230;</span><div class="read-more"><a href="http://www.plagiarismchecker.net/plagiarism-detection/emergent-external-plagiarism-detection/">Read more &#8250;</a></div><!-- end of .read-more -->]]></description>
				<content:encoded><![CDATA[<p>One of the most significant challenges in external plagiarism detection is the breadth of the corpus data that is set as the input for analytical systems.  Micol et al. reflect that, in order to manage information on this scale, programmes are now incorporating natural language processing (NLP) to determine whether intra-textual approximations and similarities can be identified more easily, at lower cost and in less time.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn1">[1]</a>  Within such pre-processing techniques, corpus filtering is an important stage which allows systems to narrow the scope of analysis and avoid scanning irrelevant documents.  To accommodate such processes, Micol et al. propose two different methods of corpus filtering: a full-text search engine which indexes and filters the database, and a document similarity measure which compares semantic information through grammatical expansion in order to detect obfuscation.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn2">[2]</a>  Through a critical comparison of these two methodologies, the researchers determined that the full-text search engine method was less proficient in its parsing methods, requiring a much smaller assessment set.  Whilst the performance outcomes in relation to plagiarism detection were similar, the findings reveal that there is potential for extrinsic detection methods that employ similarity and semantic analysis modules.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn3">[3]</a></p>
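<p>The idea of filtering a corpus with a full-text index before any detailed comparison can be illustrated with a toy inverted index.  This is a minimal sketch under stated assumptions rather than the Micol et al. system: the ranking by shared distinct terms, the <code>top_k</code> cut-off, and all function and document names are hypothetical.</p>

```python
from collections import Counter, defaultdict


def build_index(corpus: dict[str, str]) -> dict[str, set[str]]:
    """Inverted index: each lower-cased term maps to the ids of documents containing it."""
    index: dict[str, set[str]] = defaultdict(set)
    for doc_id, text in corpus.items():
        for term in set(text.lower().split()):
            index[term].add(doc_id)
    return index


def candidate_sources(suspicious: str, index: dict[str, set[str]], top_k: int = 2) -> list[str]:
    """Rank documents by how many distinct query terms they share and keep the top_k."""
    hits: Counter = Counter()
    for term in set(suspicious.lower().split()):
        for doc_id in index.get(term, ()):
            hits[doc_id] += 1
    return [doc_id for doc_id, _ in hits.most_common(top_k)]


# Toy corpus (made-up documents).
corpus = {
    "d1": "plagiarism detection with n-grams",
    "d2": "cooking pasta at home",
    "d3": "external plagiarism detection corpus filtering",
}
index = build_index(corpus)
print(candidate_sources("corpus filtering for external plagiarism detection", index))
```

<p>Only the documents returned by the filter would then be passed to the more expensive comparison stage, which is the saving that corpus filtering is intended to deliver.</p>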
<p>Oberreuter et al. suggest that external plagiarism detection, by definition, involves the comparison of suspicious documents with a set of possible references; however, the parameters of deviation and foundations of detection continue to vary according to external platform.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn4">[4]</a>  Accordingly, the researchers propose a more systematic approach to detection strategies which employs the n-gram technique for narrowing the search space within the external corpus.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn5">[5]</a>  Focusing on the link between intrinsic and extrinsic detection techniques, Oberreuter et al. cite an intrinsic n-gram profiling technique which quantifies style variation according to dissimilarity measures throughout the text in question.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn6">[6]</a>  Building upon the work of Stamatatos, such an intrinsic assessment protocol is designed to identify and quantify variations in the writer’s style, allowing reviewers to systematically determine whether there is a high likelihood of plagiarism.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn7">[7]</a>  On the external side of this analysis, Oberreuter et al. contribute to the intrinsic detection methodology by modelling the closeness of documents using a distance-based outlier detection approach applied to an academic corpus.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn8">[8]</a>  Such research highlights the comprehensive approach which more evolutionary platforms are pursuing in their incorporation of both intrinsic and extrinsic analysis methodologies.</p>
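<p>The intrinsic side of this approach, quantifying style variation with character n-gram profiles, can be sketched as follows.  The dissimilarity formula used here (squared relative differences of normalised n-gram frequencies, scaled into the range 0 to 1) is an assumption about the general form of such measures and should not be read as the exact published definition; the window size, step, and function names are likewise hypothetical.</p>

```python
from collections import Counter


def profile(text: str, n: int = 3) -> dict[str, float]:
    """Normalised frequencies of the character n-grams in text."""
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    counts = Counter(grams)
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()}


def dissimilarity(window: str, document: str, n: int = 3) -> float:
    """Dissimilarity in [0, 1] between a window profile and the whole-document profile."""
    pw, pd = profile(window, n), profile(document, n)
    if not pw:
        return 0.0
    total = 0.0
    for gram, fw in pw.items():
        fd = pd.get(gram, 0.0)
        total += (2 * (fw - fd) / (fw + fd)) ** 2
    return total / (4 * len(pw))


def style_curve(document: str, window: int = 200, step: int = 50) -> list[float]:
    """Dissimilarity of each sliding window against the document as a whole."""
    starts = range(0, max(len(document) - window, 0) + 1, step)
    return [dissimilarity(document[s:s + window], document) for s in starts]


# Made-up text: a uniform passage followed by a stylistically foreign one.
text = "the cat sat on the mat. " * 20 + "QUANTUM ZYGOTE XYLOPHONE JUXTAPOSE! " * 5
print(style_curve(text, window=100, step=50))
```

<p>Windows whose values stand well above the rest of the curve are the candidates a reviewer would inspect; where to place that cut-off is a separate design decision not addressed in this sketch.</p>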
<p>Authorship and intra-textual comparisons continue to operate at the root of plagiarism detection models; however, Suzuki et al. argue that rather than prioritising grammatical or voice-based stylisation, there are other indicators that should be considered in such approaches.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn9">[9]</a>  Specifically, the researchers identify morphemes (network indicators) and co-occurrence based concentration indicators for authorship analysis.  Within these particular indicators, there are four groups of characteristics, namely frequency of morphemes, basic indicators, network indicators, and co-occurrence based indicators, which are based upon lexical and morphological similarities between the target and the source texts.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn10">[10]</a>  Supplementing more conventional lexical indicators, the researchers argue that this new technique has extended value for authorship profiling and computational sociolinguistics, allowing for a clear definition of the author’s profile and character that is based solely on the characteristics of the texts.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn11">[11]</a>  The research itself is oriented towards a much broader spectrum of assessment than simply plagiarism detection; however, the findings offer a valuable insight into ways in which cross-comparison of grammatical and lexical fingerprints can serve as an intermediate detection platform for expediting the plagiarism analysis process.</p>
<div>
<hr align="left" size="1" width="33%" />
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref1">[1]</a> D. Micol, O. Ferrandez, and R. Munoz, ‘Information Retrieval Techniques for Corpus Filtering Applied to External Plagiarism Detection,’ In: R. Munoz et al. (Eds) <i>NLDB</i>, (2011), p. 101.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref2">[2]</a> D. Micol, O. Ferrandez, and R. Munoz, ‘Information Retrieval Techniques for Corpus Filtering Applied to External Plagiarism Detection,’ In: R. Munoz et al. (Eds) <i>NLDB</i>, (2011), pp. 102-3.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref3">[3]</a> D. Micol, O. Ferrandez, and R. Munoz, ‘Information Retrieval Techniques for Corpus Filtering Applied to External Plagiarism Detection,’ In: R. Munoz et al. (Eds) <i>NLDB</i>, (2011), p. 109.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref4">[4]</a> G. Oberreuter, G. L’Huillier, S.A. Rios, and J.D. Velasquez, ‘Outlier-Based Approaches for Intrinsic and External Plagiarism Detection,’ In: A. Konig et al. (Eds), <i>KES</i>, pp. 11-20.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref5">[5]</a> G. Oberreuter, G. L’Huillier, S.A. Rios, and J.D. Velasquez, ‘Outlier-Based Approaches for Intrinsic and External Plagiarism Detection,’ In: A. Konig et al. (Eds), <i>KES</i>, p. 12.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref6">[6]</a> G. Oberreuter, G. L’Huillier, S.A. Rios, and J.D. Velasquez, ‘Outlier-Based Approaches for Intrinsic and External Plagiarism Detection,’ In: A. Konig et al. (Eds), <i>KES</i>, p. 12.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref7">[7]</a> E. Stamatatos, ‘Intrinsic Plagiarism Detection Using Character n-Gram Profiles.’ In: B. Stein, P. Rosso, E. Stamatatos, M. Koppel, and E. Agirre, (Eds). <i>SEPLN</i>, (2009), pp. 38-46.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref8">[8]</a> G. Oberreuter, G. L’Huillier, S.A. Rios, and J.D. Velasquez, ‘Outlier-Based Approaches for Intrinsic and External Plagiarism Detection,’ In: A. Konig et al. (Eds), <i>KES</i>, p. 14.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref9">[9]</a> T. Suzuki, S. Kawamura, F. Yoshikane, K. Kageura, and A. Aizawa, ‘Co-Occurrence Based Indicators for Authorship Analysis’, <i>Literary and Linguistic Computing</i>, (2012), Vol. 27, No. 2, pp. 197-214.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref10">[10]</a> T. Suzuki, S. Kawamura, F. Yoshikane, K. Kageura, and A. Aizawa, ‘Co-Occurrence Based Indicators for Authorship Analysis’, <i>Literary and Linguistic Computing</i>, (2012), Vol. 27, No. 2, p. 199.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref11">[11]</a> T. Suzuki, S. Kawamura, F. Yoshikane, K. Kageura, and A. Aizawa, ‘Co-Occurrence Based Indicators for Authorship Analysis’, <i>Literary and Linguistic Computing</i>, (2012), Vol. 27, No. 2, p. 210.</p>
</div>
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismchecker.net/plagiarism-detection/emergent-external-plagiarism-detection/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>2.4  Alternative Models and Algorithm Targeting</title>
		<link>http://www.plagiarismchecker.net/plagiarism-detection/alternative-models-algorithm-targeting/</link>
		<comments>http://www.plagiarismchecker.net/plagiarism-detection/alternative-models-algorithm-targeting/#comments</comments>
		<pubDate>Thu, 28 Mar 2013 21:36:16 +0000</pubDate>
		<dc:creator>Plagiarism Checker</dc:creator>
				<category><![CDATA[Plagiarism Detection Methods]]></category>

		<guid isPermaLink="false">http://www.plagiarismchecker.net/?p=2363</guid>
		<description><![CDATA[<a href="http://www.plagiarismchecker.net/plagiarism-detection/alternative-models-algorithm-targeting/"><img align="left" hspace="5" width="150" height="150" src="http://www.plagiarismchecker.net/wp-content/plugins/thumbnail-for-excerpts/tfe_no_thumb.png" class="alignleft wp-post-image tfe" alt="" title="" /></a>The tree-based structure model of plagiarism detection is becoming an important foundation in semantic text recognition and intrinsic plagiarism detection.  Expanding beyond the limitation of database sourcing, Tschuggnall and Specht focus on detecting plagiarism without using a reference corpus by<span class="ellipsis">&#8230;</span><div class="read-more"><a href="http://www.plagiarismchecker.net/plagiarism-detection/alternative-models-algorithm-targeting/">Read more &#8250;</a></div><!-- end of .read-more -->]]></description>
				<content:encoded><![CDATA[<p>The tree-based structure model of plagiarism detection is becoming an important foundation in semantic text recognition and intrinsic plagiarism detection.  Expanding beyond the limitation of database sourcing, Tschuggnall and Specht focus on detecting plagiarism without using a reference corpus by processing and analysing the grammar of the document in question.  Although Dey and Sobhan differentiate between intra- and extra-corpal plagiarism<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn1">[1]</a>, Tschuggnall and Specht reflect that in detection protocols, external detection algorithms compare a suspicious document with a set of source documents, whilst intrinsic detection algorithms try to find plagiarised sections by inspecting the suspicious document only.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn2">[2]</a>  Within the field of intrinsic detection, grammatical analysis algorithms such as n-grams or word frequency models are largely based upon the assumption of authoring and sentence structure similarities.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn3">[3]</a>  As an alternative to these previously tested techniques, Tschuggnall and Specht propose a new ‘plag-inn’ algorithm which uses sentence boundary detection to identify the beginning and ending of sentences, a comparison of the grammatical structure according to a tree-based model, and a calculation of distance between each grammar tree.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn4">[4]</a>  The technique is unique, as it generates both visual and quantitative evidence of variations in the authoring of a particular text, limiting the need for any additional external resources in the initial plagiarism assessment process.</p>
<p>Extending the Tschuggnall and Specht corpus-free, intrinsic plagiarism detection model, Meyer zu Eissen and Stein based their detection techniques on the evolution of plagiarism taxonomy towards a stylistic analysis.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn5">[5]</a>  In order to effectively quantify the ‘natural parts’ of an author’s unique writing style, the researchers proposed an assessment of several key features as follows<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn6">[6]</a>:</p>
<ul>
<li>Stylometric Features: Quantify aspects of the writing style according to five primary categories including text statistics, syntactic features, part-of-speech features, closed-class word sets (special words), and structural features.</li>
<li>Averaged Word Frequency Class: The frequency class of a word is directly connected to Zipf’s law and is used as an indicator of a word’s customariness.  The feature averages these classes over a passage, with each word’s class derived from how frequently it is used in the overall text corpus.</li>
</ul>
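<p>The averaged word frequency class can be illustrated with a short calculation.  The sketch assumes the usual definition of a word’s frequency class as the floor of the base-2 logarithm of the ratio between the count of the most frequent corpus word and the count of the word itself; the toy reference counts, the treatment of unseen words, and the function names are all hypothetical.</p>

```python
import math


def frequency_class(word: str, corpus_counts: dict[str, int]) -> int:
    """Class 0 is the most frequent corpus word; rarer words fall into higher classes."""
    top = max(corpus_counts.values())
    count = corpus_counts.get(word, 1)  # assumption: unseen words are counted once
    return math.floor(math.log2(top / count))


def averaged_word_frequency_class(passage: str, corpus_counts: dict[str, int]) -> float:
    """Mean frequency class over all whitespace-separated words of the passage."""
    words = passage.lower().split()
    if not words:
        return 0.0
    return sum(frequency_class(w, corpus_counts) for w in words) / len(words)


# Made-up reference counts standing in for a large corpus.
toy_counts = {"the": 1024, "of": 512, "house": 16, "perspicacious": 1}
print(averaged_word_frequency_class("the house of the perspicacious", toy_counts))
```

<p>A passage whose average sits well away from the rest of the document uses noticeably rarer or more common vocabulary, which is the kind of shift in style this feature is meant to expose.</p>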
<p>Acknowledging the lack of an adequate ‘reference collection’ for this particular experiment, Meyer zu Eissen and Stein developed a new corpus based upon four corpus linguistic criteria: authenticity and homogeneity, possibility of types of plagiarism, processable for human and machine, and clear separation of text and annotations.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn7">[7]</a>  The assessment process involved the creation of 450 documents (from published sources), each containing 3-6 plagiarized passages, which were decomposed into 50-100 unique passages from which the feature vectors were computed.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn8">[8]</a>  The outputs of this particular study were extremely limited; however, it is clear from the three primary factors (frequency, preposition number, and sentence length) that it is statistically possible to identify changes in writer voice through the assessment of grammatical elements.</p>
<p>Recognising a breadth of deficiencies in existing detection models, Sandhya and Chitrakala propose an alternative document retrieval and plagiarism detection system which employs a non-traditional technique entitled the ‘multi-layer self-organizing map’ (MLSOM).<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn9">[9]</a>  This particular system utilises a clustering algorithm to represent comparative documents in a tree structure which is extracted by partitioning the document into pages and paragraphs.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn10">[10]</a>  Focusing on ‘type synonymy’, the researchers attempt to identify similarities across the paragraphs in the multiple documents, wherein semantic similarity computations are performed on a word by word basis.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn11">[11]</a>  Importantly, the tree structure employed in this particular model involves a layer loop analysis in which the MLSOM algorithm extracts and delimits paragraphs, conducts a word by word assessment, and identifies both exact and paraphrase linkages.  The Sandhya and Chitrakala research is an important step forward in the automation of plagiarism detection, particularly when attempting to reconcile one of the most serious disadvantages of these electronic systems: the nature of semantic variability and breadth of student-based linguistic manipulation.</p>
<div>
<hr align="left" size="1" width="33%" />
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref1">[1]</a> K. Dey and M.A. Sobhan, ‘Impact of Unethical Practices of Plagiarism on Learning, Teaching and Research in Higher Education: Some Combating Strategies,’ <i>ITHET</i>, (2006).</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref2">[2]</a> M. Tschuggnall and G. Specht, ‘Plag-Inn: Intrinsic Plagiarism Detection Using Grammar Trees,’ In: Bouma et al. (Eds.). <i>NLDB</i>, (2012), p. 284.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref3">[3]</a> M. Tschuggnall and G. Specht, ‘Plag-Inn: Intrinsic Plagiarism Detection Using Grammar Trees,’ In: Bouma et al. (Eds.). <i>NLDB</i>, (2012), p. 285.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref4">[4]</a> M. Tschuggnall and G. Specht, ‘Plag-Inn: Intrinsic Plagiarism Detection Using Grammar Trees,’ In: Bouma et al. (Eds.). <i>NLDB</i>, (2012), p. 285.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref5">[5]</a> S. Meyer zu Eissen and B. Stein, ‘Intrinsic Plagiarism Detection,’ In: M. Lalmas et al. (Eds). <i>ECIR</i>, (2006), pp. 565-569.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref6">[6]</a> S. Meyer zu Eissen and B. Stein, ‘Intrinsic Plagiarism Detection,’ In: M. Lalmas et al. (Eds). <i>ECIR</i>, (2006), pp. 566-7.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref7">[7]</a> S. Meyer zu Eissen and B. Stein, ‘Intrinsic Plagiarism Detection,’ In: M. Lalmas et al. (Eds). <i>ECIR</i>, (2006), p. 567.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref8">[8]</a> S. Meyer zu Eissen and B. Stein, ‘Intrinsic Plagiarism Detection,’ In: M. Lalmas et al. (Eds). <i>ECIR</i>, (2006), p. 567.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref9">[9]</a> S. Sandhya and S. Chitrakala, ‘Plagiarism Detection of Paraphrases in Text Documents with Document Retrieval,’ In: <i>Advances in Computing and Information Technology</i> Vol. 198, pp. 330-338.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref10">[10]</a> S. Sandhya and S. Chitrakala, ‘Plagiarism Detection of Paraphrases in Text Documents with Document Retrieval,’ In: <i>Advances in Computing and Information Technology</i> Vol. 198, p. 332.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref11">[11]</a> S. Sandhya and S. Chitrakala, ‘Plagiarism Detection of Paraphrases in Text Documents with Document Retrieval,’ In: <i>Advances in Computing and Information Technology</i> Vol. 198, p. 332.</p>
</div>
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismchecker.net/plagiarism-detection/alternative-models-algorithm-targeting/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>2.3 Semantic Similarity and Intrinsic Detection</title>
		<link>http://www.plagiarismchecker.net/plagiarism-detection/semantic-similarity-intrinsic-detection/</link>
		<comments>http://www.plagiarismchecker.net/plagiarism-detection/semantic-similarity-intrinsic-detection/#comments</comments>
		<pubDate>Thu, 28 Mar 2013 21:35:34 +0000</pubDate>
		<dc:creator>Plagiarism Checker</dc:creator>
				<category><![CDATA[Plagiarism Detection Methods]]></category>

		<guid isPermaLink="false">http://www.plagiarismchecker.net/?p=2361</guid>
		<description><![CDATA[<a href="http://www.plagiarismchecker.net/plagiarism-detection/semantic-similarity-intrinsic-detection/"><img align="left" hspace="5" width="150" height="150" src="http://www.plagiarismchecker.net/wp-content/plugins/thumbnail-for-excerpts/tfe_no_thumb.png" class="alignleft wp-post-image tfe" alt="" title="" /></a>By definition, semantic plagiarism is more complex than ‘cut and paste’ efforts, wherein unique ideas are integrated from sources into a student’s own work.[1]  Building upon the concept of textual summarisation championed by Alzahrani et al.[2], Kent and Salim propose<span class="ellipsis">&#8230;</span><div class="read-more"><a href="http://www.plagiarismchecker.net/plagiarism-detection/semantic-similarity-intrinsic-detection/">Read more &#8250;</a></div><!-- end of .read-more -->]]></description>
				<content:encoded><![CDATA[<p>By definition, semantic plagiarism is more complex than ‘cut and paste’ efforts, wherein unique ideas are integrated from sources into a student’s own work.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn1">[1]</a>  Building upon the concept of textual summarisation championed by Alzahrani et al.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn2">[2]</a>, Kent and Salim propose that the fuzzy swarm approach to key text comparison of sentence similarity offers a unique mechanism for detecting semantic similarity and narrowing the scope of evidence used for determining plagiarism.  An essential component of this model is the ability to distinguish between two examples of student work via intrinsic detection.  Li et al. developed a specific methodology in which word similarity and order similarity are compared utilising word vectors from the two sentences being compared.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn3">[3]</a>  Such modelling is based upon matrix operations and singular value decomposition (SVD), whereby it is possible to introduce scaling controls in order to limit the breadth of semantic analysis.  Specifically, words that sit within the higher layers of the ‘hierarchical semantic net’ are less similar to one another than those at lower layers, whereby the calculations themselves apply a monotonically increasing function of depth to accurately define the depth of the search process.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn4">[4]</a></p>
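<p>The two ingredients described above, word similarity scaled by depth in the semantic net and word-order similarity between two sentences, can be sketched as below.  The functional forms (an exponential decay in path length multiplied by a hyperbolic tangent of depth, and a normalised difference of word-order vectors) and the constants 0.2 and 0.45 are stated here as assumptions about the general shape of such a model, not as a verified transcription of Li et al.  The sketch also matches words exactly rather than by semantic similarity, and takes path length and depth as given inputs rather than looking them up in a lexical database.</p>

```python
import math


def word_similarity(path_length: int, depth: int, alpha: float = 0.2, beta: float = 0.45) -> float:
    """Similarity falls with path length and rises monotonically with subsumer depth."""
    return math.exp(-alpha * path_length) * math.tanh(beta * depth)


def word_order_similarity(sentence_1: list[str], sentence_2: list[str]) -> float:
    """1 minus the normalised distance between the two word-order vectors."""
    joint = list(dict.fromkeys(sentence_1 + sentence_2))  # distinct words, first-seen order

    def order_vector(sentence: list[str]) -> list[int]:
        # 1-based position of each joint word in the sentence, 0 when absent.
        return [sentence.index(w) + 1 if w in sentence else 0 for w in joint]

    r1, r2 = order_vector(sentence_1), order_vector(sentence_2)
    diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(r1, r2)))
    total = math.sqrt(sum((a + b) ** 2 for a, b in zip(r1, r2)))
    return 1.0 - diff / total if total else 1.0


print(word_similarity(path_length=2, depth=6))
print(word_order_similarity("the dog bit the man".split(), "the man bit the dog".split()))
```

<p>The second call shows why order matters: both sentences contain exactly the same words, yet the order score drops below 1 because the words have changed position.</p>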
<p>The most progressive inclusion in the Li et al. semantic analysis model is the incorporation of a Princeton University database called WordNet.  The database is defined as a ‘large lexical database of English in which nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms, each expressing a distinct concept’.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn5">[5]</a>  These synsets (cognitive synonyms) are distributed in WordNet according to a hierarchical structure in which language features are grouped according to their progressively specific relationship, allowing for a pathway between the words across the lexical tree.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn6">[6]</a>  This lexical knowledge base mirrors human understanding of words in natural language usage and allows for modelling semantic similarity across the intra-textual inputs.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn7">[7]</a>  The output of this analytical process is a word order and sentence order threshold assessment, whereby plagiarism is statistically measured and its likelihood can be defined according to a percentage basis.</p>
<p>The applicability of the SVD analysis in plagiarism evolves out of a vector based mapping protocol traditionally reserved for geographic, climatologic, and other complex technical problems.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn8">[8]</a>  To test the value of this protocol, Ceska conducted a systematic assessment of documents, applying six different stages which ultimately result in a normalised, summative output of plagiarised similarities.  These six stages can be explicated as follows<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn9">[9]</a>:</p>
<ul>
<li>Text Pre-Processing:  Based on natural language processing (NLP) tasks, stop-word removal (removal of common words that carry little content) and lemmatization (reduction of each word to its lemma, or dictionary form, taking context and part of speech into account) are applied to the source document.</li>
<li>Phrase Extraction: Retrieves simple ideas from the text, specifically the word N-grams of a specified length from pre-processed text (1-5 words).  Note: the accuracy of the method decreases quickly as the length of the N-grams increases.</li>
<li>Phrase Analysis and Reduction: A document frequency (DF) feature selection protocol which allows phrases occurring in just one document to be removed right away.  It also eliminates common phrases contained in more documents than a threshold derived from μ and σ, where μ is the mean document frequency and σ is the standard deviation from the mean document frequency.</li>
<li>Building a Document Model: Phrase by document model which considers occurrence frequencies of phrases in the documents, whereby relationships can be depicted in matrix form.</li>
<li>Latent Semantic Analysis: Infer latent semantic associations among phrases in the documents.  SVD is employed to decompose the initial matrix into three independent matrices.  All matrices are then truncated to a reduced latent space.</li>
<li>Document Similarity Normalization: Compute mutual pairwise document similarity.  Correlation calculations are performed to identify possible intersections and similarity is graphically modelled.</li>
</ul>
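<p>The first four stages above can be sketched with the standard library alone.  This is a simplified illustration rather than the Ceska implementation: the stop-word list is a toy one, lemmatization is skipped, only the lower document-frequency cut (phrases found in a single document) is applied, and the SVD and normalisation stages are left out.  All names are hypothetical.</p>

```python
from collections import Counter

STOP_WORDS = {"the", "a", "an", "of", "and", "is", "in", "to"}  # toy list (assumption)


def preprocess(text: str) -> list[str]:
    """Stage 1, simplified: lower-case, strip punctuation, drop stop words."""
    tokens = [t.strip(".,;:!?()\"'").lower() for t in text.split()]
    return [t for t in tokens if t and t not in STOP_WORDS]


def phrases(tokens: list[str], n: int) -> list[tuple[str, ...]]:
    """Stage 2: word n-grams of length n."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def phrase_document_matrix(docs: list[str], n: int = 2):
    """Stages 3 and 4: drop phrases seen in one document only, then count the rest."""
    per_doc = [Counter(phrases(preprocess(d), n)) for d in docs]
    df = Counter(p for counts in per_doc for p in counts)  # document frequency
    kept = sorted(p for p, k in df.items() if k >= 2)
    matrix = [[counts[p] for counts in per_doc] for p in kept]  # rows: phrases
    return kept, matrix


# Made-up documents; the first two share a reworded sentence.
docs = [
    "The quick brown fox jumps over the lazy dog.",
    "A quick brown fox leaps over a sleeping dog.",
    "Stock markets fell sharply in early trading.",
]
kept, matrix = phrase_document_matrix(docs)
for phrase, row in zip(kept, matrix):
    print(phrase, row)
```

<p>In the full method the resulting phrase-by-document matrix would be passed to the SVD stage; here it simply shows which documents share surviving phrases.</p>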
<p>The Ceska SVD model is an important step forward in addressing the high degree of semantic variability associated with student work and the potentiality for intrinsic, strategic manipulations of the source document.  In a more recent semantic analysis model presented by Ceglarek and Haniewicz, the primary objective in dissection and compression is to identify passages of similar but altered text that were ‘borrowed’ from an original document.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn10">[10]</a>  The researchers leverage a sentence hashing technique which serves to normalise the text according to the number of sentences separated by a full stop.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn11">[11]</a>  Further dissection in the hashing process involves the division of sentences into text frames, whereby each term in the frame is mapped onto a unique number and a value for the frame is computed based upon the sum of the numbers representing the terms.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn12">[12]</a>  One of the primary challenges of this particular approach, however, is determining the appropriate length of the text frame, a factor which Ceglarek and Haniewicz suggest is relative to the language and domain of the document in question.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn13">[13]</a>  When the assessment algorithm is then applied to this hashing sequence, key matches are detected and the frame numbers are updated to reflect the degree of matching across the frame sequence.  Given that this particular approach was originally rejected because of the processing time required, the Ceglarek and Haniewicz introduction of sentence hashing, like the normalisation techniques employed by Ceska, is an important step forward in pre-processing and system efficiency.</p>
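<p>The frame-hashing step can be illustrated as follows.  The sketch assumes sliding frames of a fixed length and numbers terms in order of first appearance; neither detail is specified above, and the function names are hypothetical.</p>

```python
def term_ids(sentences: list[list[str]]) -> dict[str, int]:
    """Map each distinct term to a unique positive integer, in order of first appearance."""
    ids: dict[str, int] = {}
    for sentence in sentences:
        for term in sentence:
            ids.setdefault(term, len(ids) + 1)
    return ids


def frame_values(sentence: list[str], ids: dict[str, int], frame_length: int = 3) -> list[int]:
    """Value of every sliding frame: the sum of the numbers representing its terms."""
    if not sentence:
        return []
    last_start = max(len(sentence) - frame_length, 0)
    return [sum(ids[t] for t in sentence[i:i + frame_length]) for i in range(last_start + 1)]


# Made-up sentences: the suspect reorders the opening words of the source.
source = "students copy text from the web".split()
suspect = "text students copy from the web".split()
ids = term_ids([source, suspect])
shared = set(frame_values(source, ids)) & set(frame_values(suspect, ids))
print(sorted(shared))
```

<p>Because a sum ignores term order, a frame whose words have merely been shuffled keeps its value, which is useful against light rewording.  The same property means that unrelated frames can collide on the same value, so matches found this way would need to be confirmed against the underlying text.</p>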
<p>Whilst many of the semantic plagiarism resources are geared towards the comprehensive assessment of essays and dissertations, the increasing popularity and prevalence of e-homework and e-classwork suites is driving the development of more advanced protocols that can offer detection value across various document and textual classes.  Xiaoping et al. build upon the Samuel et al.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn14">[14]</a> model of dashboard-based e-plagiarism detection in order to develop a revised platform based upon the vector space model (VSM).<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn15">[15]</a>  In effect, the protocol operates as a gateway resource for instructors, rejecting texts that are identified as plagiarised or that contain a degree of plagiarism exceeding the system threshold.  The module-based architecture is composed of the CMS upload queue, a plagiarism detection module, and a detection result management module, significantly reducing the workload of e-classroom instructors.<a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftn16">[16]</a></p>
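<p>A threshold-based gateway of this kind might look something like the sketch below, which represents each text as a term-frequency vector and rejects a submission when its cosine similarity to any archived text reaches the threshold.  The function names, the raw term-frequency weighting, and the threshold of 0.8 are assumptions made for illustration; the source does not specify the weighting scheme or threshold used by Xiaoping et al.</p>

```python
import math
import re
from collections import Counter

def term_vector(text):
    """Represent a text as a sparse term-frequency vector."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in set(a).intersection(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def screen_submission(submission, archive, threshold=0.8):
    """Return (accepted, best_score); reject when the best match reaches the threshold."""
    vector = term_vector(submission)
    scores = (cosine_similarity(vector, term_vector(doc)) for doc in archive)
    best = max(scores, default=0.0)
    return threshold > best, best

# Hypothetical archive of previously submitted homework texts.
archive = [
    "the cat sat on the mat",
    "photosynthesis converts light into chemical energy",
]
print(screen_submission("the cat sat on the mat", archive))
print(screen_submission("volcanoes erupt when magma rises", archive))
```

<p>The first submission duplicates an archived text and is rejected, whilst the second shares no terms with the archive and is accepted.  In practice the threshold would need tuning, since short texts on the same topic can share much of their vocabulary without any copying.</p>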
<div>
<hr align="left" size="1" width="33%" />
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref1">[1]</a> C.K. Kent and N. Salim, ‘Web Based Cross Language Semantic Plagiarism Detection,’ <i>IEEE 9<sup>th</sup> International Conference on Dependable, Autonomic, and Secure Computing</i>, (2011), pp. 1096-1102.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref2">[2]</a> S. Alzahrani, M.S. Binwahlan, N. Salim, L. Suanmali, and C.K. Kent, ‘The Development of Cross-Language Plagiarism Detection Tool Utilising Fuzzy Swarm-Based Summarisation,’ <i>IEEE 10<sup>th</sup> International Conference on Intelligent Systems Design and Applications</i>, pp. 86-90.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref3">[3]</a> Y. Li, D. McLean, Z.A. Bandar, J.D. O’Shea, and K. Crockett, ‘Sentence Similarity Based on Semantic Nets and Corpus Statistics,’ <i>IEEE Transactions on Knowledge and Data Engineering</i>, (2006), Vol. 18, No. 8.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref4">[4]</a> Y. Li, D. McLean, Z.A. Bandar, J.D. O’Shea, and K. Crockett, ‘Sentence Similarity Based on Semantic Nets and Corpus Statistics,’ <i>IEEE Transactions on Knowledge and Data Engineering</i>, (2006), Vol. 18, No. 8, p. 1142.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref5">[5]</a> WordNet, ‘What is WordNet?’ (2012), Online Resource.  Accessed on 10<sup>th</sup> November from: <a href="http://wordnet.princeton.edu/">http://wordnet.princeton.edu/</a>.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref6">[6]</a> Y. Li, D. McLean, Z.A. Bandar, J.D. O’Shea, and K. Crockett, ‘Sentence Similarity Based on Semantic Nets and Corpus Statistics,’ <i>IEEE Transactions on Knowledge and Data Engineering</i>, (2006), Vol. 18, No. 8, p. 1144.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref7">[7]</a> Y. Li, D. McLean, Z.A. Bandar, J.D. O’Shea, and K. Crockett, ‘Sentence Similarity Based on Semantic Nets and Corpus Statistics,’ <i>IEEE Transactions on Knowledge and Data Engineering</i>, (2006), Vol. 18, No. 8, p. 1148.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref8">[8]</a> Z. Ceska, ‘Plagiarism Detection Based on Singular Value Decomposition,’ In: A. Ranta and B. Nordström (Eds.), <i>Advances in Natural Language Processing</i>, (2008).</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref9">[9]</a> Z. Ceska, ‘Plagiarism Detection Based on Singular Value Decomposition,’ In: A. Ranta and B. Nordström (Eds.), <i>Advances in Natural Language Processing</i>, (2008), pp. 110-14.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref10">[10]</a> D. Ceglarek and K. Haniewicz, ‘Fast Plagiarism Detection by Sentence Hashing,’ In: L. Rutkowski et al. (Eds.), <i>Artificial Intelligence and Soft Computing</i>, (2012), p. 30.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref11">[11]</a> D. Ceglarek and K. Haniewicz, ‘Fast Plagiarism Detection by Sentence Hashing,’ In: L. Rutkowski et al. (Eds.), <i>Artificial Intelligence and Soft Computing</i>, (2012), p. 31.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref12">[12]</a> D. Ceglarek and K. Haniewicz, ‘Fast Plagiarism Detection by Sentence Hashing,’ In: L. Rutkowski et al. (Eds.), <i>Artificial Intelligence and Soft Computing</i>, (2012), p. 32.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref13">[13]</a> D. Ceglarek and K. Haniewicz, ‘Fast Plagiarism Detection by Sentence Hashing,’ In: L. Rutkowski et al. (Eds.), <i>Artificial Intelligence and Soft Computing</i>, (2012), p. 35.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref14">[14]</a> Z. Xiaoping, M. Xiaoxuan, and S. Honghong, ‘Research on a VSM-Based E-Homework Anti Plagiarism System,’ <i>IEEE International Conference on Information Management, Innovation Management, and Industrial Engineering</i>, (2012), pp. 102-105.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref15">[15]</a> N. Samuel, N. Samuel, and S. Butakov, ‘XML Based Format for Exchange of Plagiarism Detection Results,’ <i>IEEE Information Science and Applications Conference</i>, (2010), pp. 1-6.</p>
</div>
<div>
<p><a title="" href="file:///C:/Users/Pooky/AppData/Local/Temp/JenPlagiarism-1.docx#_ftnref16">[16]</a> N. Samuel, N. Samuel, and S. Butakov, ‘XML Based Format for Exchange of Plagiarism Detection Results,’ <i>IEEE Information Science and Applications Conference</i>, (2010), p. 5.</p>
</div>
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismchecker.net/plagiarism-detection/semantic-similarity-intrinsic-detection/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
